From the Publisher

Software  Sustainment  or  Maintenance? 

I  work  with  a  person  who  cringes  when  he  hears  the  term  software  maintenance.  For  him, 
maintenance  brings  forth  images  of  an  angry  general  asking  why  he  is  paying  to  fix 
a  system  that  he  already  paid  for  and  expects  to  work.  As  you  will  read  in  this  month’s 
issue, software maintenance usually involves so much more, and perhaps the term software
sustainment is more descriptive.

As  the  Crosstalk  staff  prepared  this  month’s  selection  of  articles,  I  noticed  one 
software  sustainment  topic  that  warrants  additional  discussion:  knowledge  retention. 
In  order  to  sustain  software  using  the  techniques  discussed  in  this  month’s  issue,  knowledge  of 
the software/system is needed. This required knowledge must be adequately planned in addition
to  other  resources  required  for  adequate  logistics  support.  Basically,  someone  has  to  know  how 
the  software  works.  Even  if  the  number  of  fixes,  enhancements,  alterations,  and  other  activities 
are  minimal  and  just  a  few  engineers  could  conceivably  address  these,  the  size  and  complexity 
of  the  system  may  be  quite  large,  requiring  more  people  to  cover  the  needed  knowledge. 

I  am  fortunate  to  work  closely  with  Dr.  Randall  Jensen,  one  of  the  leaders  in  the  software 
estimation  community.  In  a  recent  discussion,  he  pointed  to  heuristics  from  Dr.  Barry  Boehm 
in  his  book,  “Software  Engineering  Economics.”  These  heuristics  assign  complexity  values  to 
various software systems such as operating systems, accounting systems, operational flight
programs (OFP), etc. For example, an OFP is assigned a value of 10. This 10 equates to (of all
things) boxes of cards that an operator can handle for this system. One can imagine this
information is quite old because computer cards are certainly before my time. Yet the statistic seems
to have stood the test of time. By allowing 2,000 cards per box, or 2 thousand source lines of code
(KSLOC) per number, a typical OFP will need a knowledgeable person for every 20 KSLOC (2
KSLOC x 10).
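
The heuristic above reduces to simple arithmetic. The short Python sketch below assumes the rule exactly as stated here (2 KSLOC per complexity point); the function name and the 200 KSLOC system size are illustrative assumptions, not figures from Dr. Boehm's book.

```python
import math

KSLOC_PER_BOX = 2  # 2,000 cards (source lines) per box, as stated above


def knowledgeable_staff(system_ksloc: float, complexity_value: int) -> int:
    """Estimate how many knowledgeable people a system needs.

    Each person is assumed to cover `complexity_value` boxes of
    KSLOC_PER_BOX thousand source lines each.
    """
    ksloc_per_person = KSLOC_PER_BOX * complexity_value
    return math.ceil(system_ksloc / ksloc_per_person)


# Hypothetical 200 KSLOC operational flight program (complexity value 10)
print(knowledgeable_staff(200, 10))
```

Rounding up reflects the idea that a partially covered block of code still needs someone who understands it.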

Of course, it is cost prohibitive to have numerous people waiting around to make few
alterations to the code. However, these people can spend their excess time supporting other systems
that may have oversight from other experts.

There are numerous other sustainment considerations within this month's CROSSTALK,
beginning with Capers Jones' Geriatric Issues of Aging Software. In his article, Jones provides a
much more comprehensive discussion of sustainment issues that must be considered when
planning a logistics effort. We next share the F-35 Lightning II sustainment approach, including
the contracting structure, in Performance-Based Software Sustainment for the F-35 Lightning II by Lloyd
Huff and George Novak. In our final theme article, Reference Metrics for Service-Oriented
Architectures, Dr. Yun-Tung Lau suggests including service time, scalability, availability, and
reliability while measuring the usefulness of a fielded service-oriented architecture.

We offer two supporting articles this month, beginning with A Primer on Java Obfuscation by
Stephen Torti, Derek Sanders, Gordon Evans, and Dr. Drew Hamilton. In this article, the
authors caution against the attempted use of obfuscation to protect Java code, while adding some
suggestions for consideration if Java is truly required in a secure system. Finally, Alison A. Frost
and Michael J. Campo discuss defect containment in Advancing Defect Containment to Quantitative
Defect Management.

Having worked on multiple sustainment efforts, I have experienced the daunting task of
understanding the code that is being altered. What would often have been a simple job for a
system I was familiar with became more complicated and time-consuming as I needed to learn
about the software before starting any changes. When this need is added to complex changes
requiring requirements modeling, contracting, etc., the list of considerations is extensive and
must be approached with educated knowledge.
Geriatric Issues of Aging Software

Capers  Jones 
Software Productivity Research, LLC.


Software has been a mainstay of business and government operations for more than 50 years. As a result, all large
enterprises utilize aging software in significant amounts. Some companies exceed 5,000,000 function points in the total volume of
their corporate software portfolios. Much of this software is now more than 10 years old, and some applications are more
than 25 years old. Maintenance of aging software tends to become more difficult year by year since updates gradually destroy
the original structure of the applications and increase their entropy. Aging software may also contain troublesome regions with
very high error densities called error-prone modules. Repairs to aging software suffer from a phenomenon called bad fix
injection, in which new defects are accidentally introduced as a byproduct of fixing previous defects.


As the 21st century advances, more than 50 percent of the global software
population is engaged in modifying existing applications rather than writing
new applications. This fact by itself should not be a surprise because
whenever an industry has more than 50 years of product experience, the
personnel who repair existing products tend to outnumber the personnel who
build new products. For example, there are more automobile mechanics in the
United States who repair automobiles than there are personnel employed in
building new automobiles.

The imbalance between software development and maintenance is opening
up new business opportunities for software outsourcing groups. It is also
generating a significant burst of research into tools and methods for
improving software maintenance performance.

What Is Software Maintenance?

The word maintenance is surprisingly ambiguous in a software context. In
normal usage, it can span some 23 forms of modification to existing
applications. The two most common meanings of the word maintenance are
1) defect repairs and 2) enhancements (adding new features to existing
software applications).


Although software enhancements and software maintenance in the sense of
defect repairs are usually funded in different ways and have quite different
sets of activity patterns associated with them, many companies lump these
disparate software activities together for budgets and cost estimates.

The author does not recommend the practice of aggregating defect repairs and
enhancements, but this practice is very common. Consider some of the basic
differences between enhancements (adding new features to applications) and
maintenance (defect repairs), as shown in Table 1.

Because the general topic of maintenance is so complicated and includes so
many different kinds of work, some companies merely lump all forms of
maintenance together, using gross metrics such as the overall percentage of
annual software budgets devoted to all forms of maintenance summed together.
This method is crude, but can convey useful information. An organization
that is proactive in using geriatric tools and services can spend less than
30 percent of its annual software budget on various forms of maintenance,
while an organization that has not used any of the geriatric tools and
services can spend more than 60 percent of its annual budget on various
forms of maintenance.

The kinds of maintenance tools used by lagging, average, and leading
organizations are shown in Table 2. Table 2 is part of a larger study that
examined many different kinds of software engineering and project
management tools [1].

It is interesting that the leading companies in terms of maintenance
sophistication not only use more tools than the laggards, but they use more
of their features as well. Again, the function point values in Table 2 refer
to the capabilities of the tools that are used in day-to-day maintenance
operations. The leaders not only use more tools, but they do more with them.

Before proceeding, let us consider 23 discrete topics that are often coupled
together under the generic term maintenance in day-to-day discussions, but
which are actually quite different in many important respects [2] (see
Table 3 for the list of 23 topics).

Although  the  23  maintenance  topics 
are  different  in  many  respects,  they  all 
have  one  common  feature  that  makes  a 
group  discussion  possible:  They  all 
involve  modifying  an  existing  application 
rather  than  starting  from  scratch  with  a 
new  application. 

Each of the 23 forms of modifying existing applications has a different
reason for being carried out. However, it often happens that several of them
take place concurrently. For example, enhancements and defect repairs are
very common in the same release of an evolving application. There are also
common sequences or patterns to these modification activities. For example,
reverse engineering often precedes reengineering, and the two occur together
so often as to almost comprise a linked set. For releases of large
applications and major systems, the author has observed between six and 10
forms of maintenance all leading up to the same release.


Table 1: Key Differences Between Maintenance and Enhancements

                        Enhancements      Maintenance
                        (New features)    (Defect repairs)
Funding source          Clients           Absorbed
Requirements            Formal            None
Specifications          Formal            None
Inspections             Formal            None
User documentation      Formal            None
New function testing    Formal            None
Regression testing      Formal            Minimal
Maintenance Engineering        Lagging    Average    Leading
Reverse engineering                  -      1,000      3,000
Reengineering                        -      1,250      3,000
Code restructuring                   -          -      1,500
Configuration control              500      1,000      2,000
Test support                         -        500      1,500
Customer support                     -        750      1,250
Debugging tools                    750        750      1,250
Defect tracking                    500        750      1,000
Complexity analysis                  -          -      1,000
Mass update search engines           -        500      1,000
Function point subtotal          1,750      6,500     16,500
Number of tools                      3          8         10

Table 2: Numbers and Size Ranges of Maintenance Engineering Tools (Size data expressed in terms
of function point metrics)


Geriatric Problems of Aging Software

Once software is put into production it continues to change in three
important ways:

1. Latent defects still present at release must be found and fixed after
   deployment.

2. Applications continue to grow and add new features at a rate of between
   5 percent and 10 percent per calendar year, due either to changes in
   business needs or to new laws and regulations, or both.

3. The combination of defect repairs and enhancements tends to gradually
   degrade the structure and increase the complexity of the application.
   The term for this increase in complexity over time is entropy. The
   average rate at which software entropy increases is about 1 percent to
   3 percent per calendar year.

Because software defect removal and quality control are imperfect, there
will always be bugs or defects to repair in delivered software applications.
The current U.S. average for defect removal efficiency is only about 85
percent of the bugs or defects introduced during development [3] and has
stayed almost the same for more than 10 years. In absolute terms, about five
bugs per function point are created during development. If 85 percent of
these are found before release, about 0.75 bugs per function point will be
released to customers. For a typical application of 1,000 function points or
100,000 source code statements, that implies about 750 defects present at
delivery. About one-third, or 250 defects, will be serious enough to stop
the application from running or create erroneous outputs.
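
The arithmetic in the preceding paragraph can be reproduced with a few lines of Python. This is only a sketch: the function and parameter names are illustrative, and the default values are the averages quoted above.

```python
def delivered_defects(function_points: float,
                      defects_per_fp: float = 5.0,
                      removal_efficiency: float = 0.85) -> float:
    """Latent defects still present at release.

    defects_per_fp     -- defect potential created during development
    removal_efficiency -- fraction of defects removed before release
    """
    potential = function_points * defects_per_fp
    return potential * (1.0 - removal_efficiency)


latent = delivered_defects(1_000)   # the 1,000 function point example
serious = latent / 3                # about one-third are serious
print(round(latent), round(serious))
```

Changing `removal_efficiency` shows how sensitive the delivered defect count is to pre-release quality control.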

Since defect potentials tend to rise with the overall size of the
application, and since defect removal efficiency levels tend to decline with
the overall size of the application, the overall volume of latent defects
delivered with the application rises with size. This explains why
super-large applications in the range of 100,000 function points, such as
Microsoft Windows and many enterprise resource planning (ERP) applications,
may require years to reach a point of relative stability. These large
systems are delivered with thousands of latent bugs or defects.

Not only is software deployed with a significant volume of latent defects,
but a phenomenon called bad fix injection has been observed for more than 50
years. Roughly 7 percent of all defect repairs will contain a new defect
that was not there before. For very complex and poorly structured
applications, these bad-fix injections have topped 20 percent [3].
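
To see what a bad-fix rate means for total repair work, the Python sketch below assumes that every repair, including repairs of bad fixes, carries the same injection rate. That assumption is ours, not the article's; under it the total number of repairs is a geometric series.

```python
def total_repairs(initial_defects: float, bad_fix_rate: float) -> float:
    """Total repairs needed when each repair may inject a new defect.

    Assumes a constant injection rate, so the series
    n + n*r + n*r**2 + ... sums to n / (1 - r).
    """
    if not 0.0 <= bad_fix_rate < 1.0:
        raise ValueError("bad_fix_rate must be in [0, 1)")
    return initial_defects / (1.0 - bad_fix_rate)


for rate in (0.07, 0.20):
    extra = total_repairs(750, rate) - 750
    print(f"bad-fix rate {rate:.0%}: about {extra:.0f} extra repairs")
```

The 750 starting defects are taken from the 1,000 function point example above.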

In the 1970s, IBM did a distribution analysis of customer-reported defects
against their main commercial software applications. The IBM personnel
involved in the study, including the author, were surprised to find that
defects were not randomly distributed through all of the modules of large
applications [4].

In the case of IBM's main operating system, about 5 percent of the modules
contained just over 50 percent of all reported defects. The most extreme
example was a large database application, where 31 modules out of 425
contained more than 60 percent of all customer-reported bugs. These
troublesome areas were known as error-prone modules.
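
A distribution analysis of this kind can be approximated from a defect-tracking export. The Python sketch below ranks modules by reported defects and returns the smallest group that accounts for a chosen share of the total. The module names and counts are hypothetical, and the 50 percent threshold is an assumption chosen to mirror the IBM figures.

```python
def error_prone_modules(defects_by_module: dict[str, int],
                        share: float = 0.5) -> list[str]:
    """Return the most defective modules that together reach `share`
    of all reported defects, ranked from worst to best."""
    total = sum(defects_by_module.values())
    ranked = sorted(defects_by_module.items(),
                    key=lambda item: item[1], reverse=True)
    selected, running = [], 0
    for name, count in ranked:
        if running >= share * total:
            break
        selected.append(name)
        running += count
    return selected


# Hypothetical customer-reported defect counts per module
reports = {"scheduler": 120, "parser": 15, "ui": 10,
           "io": 95, "logging": 5, "auth": 5}
print(error_prone_modules(reports))
```

If a small fraction of the modules is returned, those modules are candidates for inspection or restructuring.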

Similar studies by other corporations such as AT&T and ITT found that
error-prone modules were endemic in the software domain. More than 90
percent of applications larger than 5,000 function points were found to
contain error-prone modules in the 1980s and early 1990s. Summaries of the
error-prone module data from a number of companies were published in [3].

Fortunately, it is possible to surgically remove error-prone modules once
they are identified. It is also possible to prevent them from occurring. A
combination of defect measurements, formal design inspections, formal code
inspections, and formal testing and test-coverage analysis has proven to be
effective in preventing error-prone modules from coming into existence [5].

Today  in  2007,  error-prone  modules 
are  almost  nonexistent  in  organizations 


Table 3: Major Kinds of Work Performed Under the Generic Term Maintenance

1. Major enhancements (new features of > 20 function points).
2. Minor enhancements (new features of < 5 function points).
3. Maintenance (repairing defects for good will).
4. Warranty repairs (repairing defects under formal contract).
5. Customer support (responding to client phone calls or problem reports).
6. Error-prone module removal (eliminating very troublesome code segments).
7. Mandatory changes (required or statutory changes).
8. Complexity or structural analysis (charting control flow plus complexity metrics).
9. Code restructuring (reducing cyclomatic and essential complexity).
10. Optimization (increasing performance or throughput).
11. Migration (moving software from one platform to another).
12. Conversion (changing the interface or file structure).
13. Reverse engineering (extracting latent design information from code).
14. Reengineering (transforming legacy application to modern forms).
15. Dead code removal (removing segments no longer utilized).
16. Dormant application elimination (archiving unused software).
17. Nationalization (modifying software for international use).
18. Mass updates such as the Euro or Year 2000 (Y2K) repairs.
19. Refactoring, or reprogramming, applications to improve clarity.
20. Retirement (withdrawing an application from active service).
21. Field service (sending maintenance members to client locations).
22. Reporting bugs or defects to software vendors.
23. Installing updates received from software vendors.
that are higher than Level 3 on the Software Engineering Institute's
Capability Maturity Model® (CMM®). However, they remain common and
troublesome for Level 1 organizations and for organizations that lack
sophisticated quality measurements and quality control.

If the author's clients are representative of the United States as a whole,
more than 50 percent of U.S. companies still do not utilize the CMM at all.
Of those who do use the CMM, fewer than 15 percent are at Level 3 or higher.
That implies that error-prone modules may exist in more than half of all
large corporations and in a majority of state government software
applications as well.

Once deployed, most software applications continue to grow at annual rates
of between 5 percent and 10 percent of their original functionality. Some
applications, such as Microsoft Windows, have increased in size by several
hundred percent over a 10-year period.

The combination of continuous growth of new features coupled with
continuous defect repairs tends to drive up the complexity levels of aging
software applications. Structural complexity can be


Table 4: Impact of Key Adjustment Factors on Maintenance (sorted in order of
maximum positive impact)

Maintenance Factors                    Plus Range
Maintenance specialists                       35%
High staff experience                         34%
Table-driven variables and data               33%
Low complexity of base code                   32%
Test coverage tools and analysis              30%
Code restructuring tools                      29%
Reengineering tools                           27%
High-level programming languages              25%
Reverse engineering tools                     23%
Complexity analysis tools                     20%
Defect tracking tools                         20%
Mass update specialists                       20%
Automated change control tools                18%
Unpaid overtime                               18%
Quality measurements                          16%
Formal base code inspections                  15%
Regression test libraries                     15%
Excellent response time                       12%
Annual training of > 10 days                  12%
High management experience                    12%
Help-desk automation                          12%
No error prone modules                        10%
Online defect reporting                       10%
Productivity measurements                      8%
Excellent ease of use                          7%
User satisfaction measurements                 5%
High team morale                               5%
Sum                                          503%

measured via metrics such as cyclomatic and essential complexity using a
number of commercial tools. If complexity is measured on an annual basis and
there is no deliberate attempt to keep complexity low, the rate of increase
is between 1 percent and 3 percent per calendar year.
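
For readers unfamiliar with the metric, the Python sketch below approximates cyclomatic complexity by counting decision points in a piece of Python source with the standard `ast` module. It is a simplified illustration, not the algorithm used by any particular commercial tool, and the set of node types counted is our assumption.

```python
import ast

# Node types treated as one decision point each (a simplifying assumption)
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra paths
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def grade(score):
    if score >= 90 and score <= 100:
        return "A"
    for _ in range(2):
        pass
    return "B"
'''
print(cyclomatic_complexity(SAMPLE))
```

Running such a measurement on each release gives the annual trend that the paragraph above describes.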

However, and this is an important fact, the rate at which entropy or
complexity increases is directly proportional to the initial complexity of
the application. For example, if an application is released with an average
cyclomatic complexity level of less than 10, it will tend to stay well
structured for at least five years of normal maintenance and enhancement
changes.

But if an application is released with an average cyclomatic complexity
level of more than 20, its structure will degrade rapidly and its complexity
levels might increase by more than 2 percent per year. The rate of increase
in entropy and complexity will even accelerate after a few years.
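
The effect of these annual rates can be projected with a compound-growth calculation. The Python sketch below assumes the percentage applies to the previous year's complexity (compounding), which the article does not state explicitly. The starting values and rates are illustrative picks from the ranges quoted above.

```python
def projected_complexity(initial: float, annual_rate: float, years: int) -> float:
    """Average complexity after `years`, assuming compound annual growth."""
    return initial * (1.0 + annual_rate) ** years


# A well-structured release versus a poorly structured one (illustrative)
well_structured = projected_complexity(10, 0.01, 5)
poorly_structured = projected_complexity(20, 0.03, 10)
print(f"{well_structured:.2f} {poorly_structured:.2f}")
```

The sketch uses a constant rate, so it understates the accelerating decay described for high-complexity applications.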

As it happens, both bad-fix injections and error-prone modules tend to
correlate strongly (although not perfectly) with high levels of complexity.
A majority of error-prone modules have cyclomatic complexity levels of 10 or
higher. Bad-fix injection levels for modifying high-complexity applications
are often higher than 10 percent.

In the late 1990s, a special kind of geriatric issue occurred which involved
making simultaneous changes to thousands of software applications. The first
of these mass update geriatric issues was the deployment of the Euro
currency, which required changes to currency conversion routines in
thousands of applications. The Euro was followed almost immediately by the
dreaded Y2K (Year 2000) problem [6], which also involved mass updates of
thousands of applications. More recently, in March of 2007, another such
issue occurred when the starting date of daylight saving time was changed.

Future mass updates will occur later in the century when it may be necessary
to add another digit to telephone numbers or area codes. Yet another and
very serious mass update will occur if it becomes necessary to add digits to
social security numbers in the second half of the 21st century. There is
also the potential problem of the Unix time clock expiration in 2038.
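
The 2038 date can be derived directly. The Python sketch below assumes the commonly cited cause, a signed 32-bit count of seconds since January 1, 1970 (UTC); the article itself does not spell out the mechanism.

```python
from datetime import datetime, timezone

# Largest value a signed 32-bit seconds counter can hold
MAX_INT32 = 2**31 - 1

last_moment = datetime.fromtimestamp(MAX_INT32, tz=timezone.utc)
print(last_moment.isoformat())
```

One second after that moment, a 32-bit counter overflows, which is why applications that store time this way would need a mass update.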

Metrics Problems With Small Maintenance Projects

There are several difficulties in exploring software maintenance costs with
accuracy. One of these difficulties is the fact that maintenance tasks are
often assigned to development personnel who interweave both development and
maintenance as the need arises. This practice makes it difficult to
distinguish maintenance costs from development costs because the programmers
are often rather careless in recording how time is spent.

® Capability Maturity Model and CMM are registered in the U.S. Patent and
Trademark Office by Carnegie Mellon University.

Another  and  very  significant  problem 
is  the  fact  that  a  great  deal  of  software 
maintenance  consists  of  making  very 
small  changes  to  software  applications. 
Quite  a  few  bug  repairs  may  involve  fixing 
only  a  single  line  of  code.  Adding  minor 
new  features,  such  as  a  new  line-item  on  a 
screen,  may  require  less  than  50  source 
code  statements. 

These small changes are below the effective lower limit for counting
function point metrics. The function point metric includes weighting factors
for complexity, and even if the complexity adjustments are set to the lowest
possible point on the scale, it is still difficult to count function points
below a level of perhaps 15 function points [7].

Quite a few maintenance tasks involve changes that are either a fraction of
a function point, or may at most be less than 10 function points or about
1,000 COBOL source code statements. Although normal counting of function
points is not feasible for small updates, it is possible to use the
backfiring method of converting counts of logical source code statements
into equivalent function points. For example, suppose an update requires
adding 100 COBOL statements to an existing application. Since it usually
takes about 105 COBOL statements in the procedure and data divisions to
encode one function point, it can be stated that this small maintenance
project is about one function point in size.

If the project takes one work day consisting of six hours, then at least the
results can be expressed using common metrics. In this case, the results
would be roughly six staff hours per function point. If the reciprocal
metric function points per staff month is used, and there are 20 working
days in the month, then the results would be 20 function points per staff
month.
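
The backfiring example and the two resulting metrics can be worked through in Python as follows. The ratio of 105 COBOL statements per function point, the six-hour day, and the 20-day month come from the text. The names are illustrative, and rounding the size to one function point follows the article's own approximation.

```python
COBOL_STATEMENTS_PER_FP = 105  # procedure and data divisions, per the text


def backfire(statements: int,
             statements_per_fp: float = COBOL_STATEMENTS_PER_FP) -> float:
    """Convert a count of logical source statements to function points."""
    return statements / statements_per_fp


exact_size = backfire(100)              # a little under one function point
size_fp = round(exact_size)             # "about one function point"
hours_per_fp = 6 / size_fp              # one six-hour work day
fp_per_staff_month = size_fp * 20       # 20 working days per month

print(f"{exact_size:.2f} {hours_per_fp:.0f} {fp_per_staff_month}")
```

Other languages need their own statements-per-function-point ratio, which is not given in this article.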

Best and Worst Practices in Software Maintenance

Because maintenance of aging legacy software is labor intensive, it is quite
important to explore the best and most cost effective methods available for
dealing with the millions of applications that currently exist. The sets of
best and worst practices are not symmetrical. For example, the practice that
has the most positive impact on maintenance productivity is the use of
trained maintenance experts. However, the factor that has the greatest
negative impact is the presence of error-prone modules in the application
that is being maintained.

Table 4 illustrates a number of factors which have been found to exert a
beneficial impact on the work of updating aging applications and shows the
percentage of improvement compared to average results.

At the top of the list of maintenance best practices is the utilization of
full-time, trained maintenance specialists rather than turning over
maintenance tasks to untrained generalists. Trained maintenance specialists
are found most often in two kinds of companies: 1) large systems software
producers such as IBM, and 2) large maintenance outsource vendors. The
curricula for training maintenance personnel can include more than a dozen
topics and the training periods range from two weeks to a maximum of about
four weeks.

Since training of maintenance specialists is the top factor, Table 5 shows a
modern maintenance curriculum such as those found in large maintenance
outsource companies.

The positive impact from utilizing maintenance specialists is one of the
reasons why maintenance outsourcing has been growing so rapidly. The
maintenance productivity rates of some of the better maintenance outsource
companies are roughly twice those of their clients prior to the completion
of the outsource agreement. Thus, even if the outsource vendor costs are
somewhat higher, there can still be useful economic gains.

Let us now consider some of the factors that exert a negative impact on the
work of updating or modifying existing software applications. Note that the
top-ranked factor that reduces maintenance productivity, the presence of
error-prone modules, is very asymmetrical. The absence of error-prone
modules does not speed up maintenance work, but their presence definitely
slows down maintenance work.

In  general,  more  than  80  percent  of 
latent  bugs  found  by  users  in  software 
applications  are  reported  against  less  than 
20  percent  of  the  modules.  Once  these 
modules  are  identified  then  they  can  be 
inspected,  analyzed,  and  restructured  to 
reduce  their  error  content  down  to  safe 
levels. 


Software Maintenance Courses                           Days   Sequence
Error-Prone Module Removal                             2.00          1
Complexity Analysis and Reduction                      1.00          2
Reducing Bad Fix Injections                            1.00          3
Defect Reporting and Analysis                          0.50          4
Change Control                                         1.00          5
Configuration Control                                  1.00          6
Software Maintenance Workflows                         1.00          7
Mass Updates to Multiple Applications                  1.00          8
Maintenance of Commercial Off-The-Shelf Packages       1.00          9
Maintenance of ERP Applications                        1.00         10
Regression Testing                                     2.00         11
Test Library Control                                   2.00         12
Test Case Conflicts and Errors                         2.00         13
Dead Code Isolation                                    1.00         14
Function Points for Maintenance                        0.50         15
Reverse Engineering                                    1.00         16
Reengineering                                          1.00         17
Refactoring                                            0.50         18
Maintenance of Reusable Code                           1.00         19
Object-Oriented Maintenance                            1.00         20
Maintenance of Agile and Extreme Code                  1.00         21
TOTAL                                                 23.50

Table 5: Sample Maintenance Curricula for Companies Using Maintenance Specialists


Table 6 summarizes the major factors that degrade software maintenance
performance. Not only are error-prone modules troublesome, but many other
factors can degrade performance too. For example, very complex spaghetti
code is quite difficult to maintain safely. It is also troublesome to have
maintenance tasks assigned to generalists rather than to trained maintenance
specialists.

A common situation that often degrades performance is lack of suitable
maintenance tools, such as defect tracking software, change management
software, test library software, and so forth. In general, it is easy to
botch maintenance and make it such a labor-intensive activity that few
resources are left over for development work.

The last factor in Table 6, no unpaid overtime, deserves a comment. Unpaid
overtime is common among software maintenance and development personnel. In
some companies it amounts to about 15 percent of the total work time.
Because it is unpaid it is usually unmeasured. That means side-by-side
comparisons of productivity rates or costs between groups with unpaid
overtime and groups without will favor the group with unpaid overtime
because so much of their work is uncompensated and, hence, invisible. This
is a benchmarking trap for


Maintenance Factors                    Minus Range
Error-prone modules                          -50%
Embedded variables and data                  -45%
Staff inexperience                           -40%
High complexity of base code                 -30%
Lack of test coverage analysis               -28%
Manual change control methods                -27%
Low-level programming languages              -25%
No defect tracking tools                     -24%
No mass update specialists                   -22%
Poor ease of use                             -18%
No quality measurements                      -18%
No maintenance specialists                   -18%
Poor response time                           -16%
Management inexperience                      -15%
No base code inspections                     -15%
No regression test libraries                 -15%
No help-desk automation                      -15%
No on-line defect reporting                  -12%
No annual training                           -10%
No code restructuring tools                  -10%
No reengineering tools                       -10%
No reverse engineering tools                 -10%
No complexity analysis tools                 -10%
No productivity measurements                  -7%
Poor team morale                              -6%
No user satisfaction measurements             -4%
No unpaid overtime                             0%
Sum                                         -500%

Table 6: Impact of Key Adjustment Factors on Maintenance (sorted in order of
maximum negative impact)
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the  unwary.  Because  excessive  overtime  is 
psychologically  harmful  if  continued  over 
long  periods,  it  is  unfortunate  that  unpaid 
overtime  tends  to  be  ignored  when 
benchmark  studies  are  performed. 
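
The arithmetic behind this benchmarking trap can be sketched in a few lines. The sketch below is illustrative only: the function name, the output units, and the two sample groups are assumptions made for this example, and only the 15 percent figure comes from the text.

```python
def productivity_rates(output_units, paid_hours, unpaid_share):
    """Return (apparent, actual) productivity in output units per hour.

    unpaid_share is the fraction of TOTAL work time that is unpaid
    overtime; 0.15 mirrors the "about 15 percent" figure in the text.
    """
    if not 0.0 <= unpaid_share < 1.0:
        raise ValueError("unpaid_share must be in [0, 1)")
    total_hours = paid_hours / (1.0 - unpaid_share)  # paid plus unpaid hours
    apparent = output_units / paid_hours   # what a benchmark study records
    actual = output_units / total_hours    # what the work really required
    return apparent, actual


# Hypothetical groups: same output, same 1,000 hours actually worked.
group_a = productivity_rates(100.0, paid_hours=1000.0, unpaid_share=0.0)
group_b = productivity_rates(100.0, paid_hours=850.0, unpaid_share=0.15)
print("Group A (no unpaid overtime):", group_a)
print("Group B (15% unpaid overtime):", group_b)
```

Under these assumed numbers, group B appears roughly 18 percent more productive (1 / 0.85) even though both groups delivered the same output in the same number of hours worked.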

Given  the  enormous  amount  of 
effort  that  is  now  being  applied  to  soft¬ 
ware  maintenance,  and  which  will  be 
applied  in  the  future,  it  is  obvious  that 
every  corporation  should  attempt  to 
adopt  maintenance  best  practices  and  avoid 
maintenance  worst  practices  as  rapidly  as 
possible. 

Software  Entropy  and  Total 
Cost  of  Ownership 

The word entropy means the tendency of systems to destabilize and become
more chaotic over time. Entropy is a term from physics rather than from
software, but the tendency it describes applies to all complex systems,
including software. All known compound objects decay and become more
complex with the passage of time unless effort is exerted to keep them
repaired and updated. Software is no exception. The accumulation of small
updates over time tends to gradually degrade the initial structure of
applications and makes changes grow more difficult.

For  software  applications,  entropy  has 
long  been  a  fact  of  life.  If  applications  are 
developed  with  marginal  initial  quality 
control  they  will  probably  be  poorly 
structured  and  contain  error-prone  mod¬ 
ules.  This  means  that  every  year,  the  accu¬ 
mulation  of  defect  repairs  and  mainte¬ 
nance  updates  will  degrade  the  original 
structure  and  make  each  change  slightly 
more  difficult.  Over  time,  the  application 
will  destabilize  and  bad  fixes  will  increase 
in  number  and  severity.  Unless  the  appli¬ 
cation  is  restructured  or  fully  refurbished, 
it  eventually  will  become  so  complex  that 
maintenance  can  only  be  performed  by  a 
few  experts  who  are  more  or  less  locked 
into  the  application. 

By  contrast,  leading  applications  that 
are  well  structured  initially  can  delay  the 
onset  of  entropy.  Indeed,  well-structured 
applications  can  achieve  declining  mainte¬ 
nance  costs  over  time.  This  is  because 
updates  do  not  degrade  the  original  struc¬ 
ture,  as  happens  in  the  case  of  spaghetti 
bowl  applications  where  the  structure  is 
almost  unintelligible  when  maintenance 
begins. 

The total cost of ownership of a software application is the sum of six
major expense elements: 1) the initial cost of building an application,
2) the cost of enhancing the application with new features over its
lifetime, 3) the cost of repairing defects and bugs over the
application’s lifetime, 4) the cost of customer support for fielding and
responding to queries and customer-reported defects, 5) the cost of
periodic restructuring or refactoring of aging applications to reduce
entropy and thereby reduce bad-fix injection rates, and 6) the cost of
removing error-prone modules through surgical removal and redevelopment.
This last expense element will only occur for legacy applications that
contain error-prone modules.
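
As a minimal sketch, the six expense elements can be modeled as a simple sum. The class name, field names, and sample amounts below are assumptions made for illustration; the article gives no figures for these elements.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class OwnershipCosts:
    """The six expense elements listed above, in one currency unit."""
    initial_build: float
    enhancements: float
    defect_repairs: float
    customer_support: float
    restructuring: float
    # Only incurred by legacy applications with error-prone modules.
    error_prone_module_removal: float = 0.0

    def total(self) -> float:
        """Total cost of ownership: the sum of all six elements."""
        return sum(getattr(self, f.name) for f in fields(self))


# Hypothetical amounts, chosen only to show the calculation.
legacy_app = OwnershipCosts(
    initial_build=100.0,
    enhancements=50.0,
    defect_repairs=30.0,
    customer_support=20.0,
    restructuring=10.0,
    error_prone_module_removal=5.0,
)
print(legacy_app.total())
```

Defaulting the sixth element to zero reflects the statement that it only occurs for legacy applications that contain error-prone modules.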

Similar  phenomena  can  be  observed 
outside  of  software.  Hypothetically,  if 
you  buy  an  automobile  that  has  a  high 
frequency  of  repair  as  shown  in 
Consumer  Reports  and  you  skimp  on 
lubrication  and  routine  maintenance,  you 
will  fairly  soon  face  some  major  repair 
problems  -  usually  well  before  50,000 
miles.  By  contrast,  if  you  buy  an  automo¬ 
bile  with  a  low  frequency  of  repair  as 
shown  in  Consumer  Reports  and  you  are 
scrupulous  in  maintenance,  you  should  be 
able  to  drive  the  car  more  than  100,000 
miles  without  major  repair  problems. 

Summary  and  Conclusions 

In  every  industry,  maintenance  tends  to 
require  more  personnel  than  building  new 
products.  For  the  software  industry,  the 
number  of  personnel  required  to  per¬ 
form  maintenance  is  unusually  large  and 
may  soon  top  70  percent  of  all  technical 
software  workers.  The  main  reasons  for 
the  high  maintenance  efforts  in  the  soft¬ 
ware  industry  are  the  intrinsic  difficulties 
of  working  with  aging  software.  Special 
factors  such  as  mass  updates  that  began 
with  the  roll-out  of  the  Euro  and  the 
Y2K  problem  are  also  geriatric  issues. 

Given  the  enormous  efforts  and  costs 
devoted  to  software  maintenance,  every 
company  should  evaluate  and  consider 
best  practices  for  maintenance  and 
should  avoid  worst  practices  if  at  all  pos¬ 
sible.  ♦ 
The complexity and sophistication of F-35 Air System software and the multiplicity of F-35 missions, versions, and
customers, combined with a performance-based contract structure, present unprecedented software sustainment challenges.
Understanding how F-35 software will be sustained is the focus of ongoing analysis and planning. This article describes some
of the revolutionary conclusions and products of that analysis and provides a look forward to performance-based sustainment
of software for the multinational F-35 fleet.


In  mid-2007,  an  article  entitled  “Lockheed 
Martin  Hopes  F-35  Leads  to 
Maintenance  Revolution”  [1]  was  released 
through  internet  news  outlets.  The  article 
stated  the  following: 


Lockheed  Martin  ...  is  designing  a 
new  kind  of  maintenance  program 
for  the  $300  billion  F-35  Joint  Strike 
Fighter  project,  which  company 
officials  say  could  set  a  new  standard 
for  military  aircraft  operations.  The 
U.S.-led,  nine-nation  fighter  pro¬ 
gram  has  “performance-based  logis¬ 
tics”  built  into  its  purchase  plan,  giv¬ 
ing  contractors  a  big  role  in  mainte¬ 
nance  management.  ...  Lockheed 
Martin  says  this  maintenance  strate¬ 
gy  means  that  logistical  support  will 
make  up  about  50  percent  of  total 
program  costs,  compared  to  67  per¬ 
cent  of  total  costs  under  a  less  cen¬ 
tralized  strategy  ...  Under  the  main¬ 
tenance  plan,  F-35  owners  will  pay 
for  anticipated  operating  time.  ... 
Based  on  how  much  the  aircraft  are 
expected  to  fly,  Lockheed  Martin 
will  manage  parts  inventory,  plan 
overhaul  schedules  and  train  the  mil¬ 
itary  crews  who  support  aircraft 
operations.  ...  The  F-35  program 
marks  the  first  time  an  entire  aircraft 
has  used  a  performance-based  logis¬ 
tics  plan.  Such  pay-by-the-flight- 
hour  maintenance  strategies  are 
more  common  for  component  sys¬ 
tems,  or  commercial  jet  operations. 


The ramifications of this maintenance revolution for software engineering
and management are extensive. Initial planning for sustainment of F-35
software began in 2002. In 2005, however, the Office of the Secretary of
Defense confirmed that a performance-based sustainment approach would be
applied to the F-35 Joint Strike Fighter (JSF) program. This decision
focused the planning phase and allowed more detailed analysis to begin.
The results of the analysis to date are reviewed in this article. These
results are shaping the approach to F-35 software sustainment in order to
support air system performance and life-cycle cost savings objectives.

F-35  Program  Overview 

The F-35 Air System consists of the Air Vehicle (AV), including the
propulsion system, and the Autonomic Logistics (AL) system. The AV is a
three-variant family of fifth-generation strike fighter aircraft
consisting of the F-35A Conventional Take-off and Landing (CTOL), F-35B
Short Take-off Vertical Landing (STOVL), and F-35C Carrier Variant (CV).

A high degree of designed-in commonality will exist among the three
variants (e.g., engines, avionics, crew station, subsystems, suspension
and release equipment, and structure). CTOL operations will be a common
capability among the variants, with unique capabilities for the CV (e.g.,
catapult and arresting gear compatibility) and STOVL (e.g., vertical
launch and recovery, and ski jump compatibility) variants. Key features
include a blend of supportable low observable technologies, highly
integrated mission systems, interchangeable propulsion systems,
interoperability, and internal and external carriage of stores. Other
examples of commonality include Prognostics and Health Management
systems, Institute of Electrical and Electronics Engineers (IEEE) 1394
aircraft bus design [2], and a cockpit incorporating advanced on-board
and off-board sensor fusion. Each variant will provide an adverse
weather, day/night capability to effectively execute operational
missions.

The  F-35  AV  operates  in  concert  with 
AL,  including  AL  Information  System 
(ALIS),  which  uses  prognostics  and  health 
information  from  the  AV  to  enable  proac¬ 
tive  maintenance.  AL  also  features  a  training 
system  which  is  concurrent  with  aircraft 
versions,  missions,  and  maintenance  tasks. 
In addition, the F-35 AV and ground systems are designed to interoperate
with the net-centric combat and logistics environments required for
modern combat operations.


Figure  1:  Estimated  F-35  Software  Sustainment  Baseline 
The air system software configuration present at the end of the system
development and demonstration phase of the program forms the software
sustainment baseline. This baseline, consisting of AV, AL systems, and
lab software, is currently estimated to be approximately 20 million
source lines of code. A break-out by category is shown in Figure 1.
Maintaining maximum commonality of this software across all variants and
versions is key to achieving program affordability goals.

Figure 2 overlays the relationship of planned F-35 program phases (top of
figure) to a standard product life cycle (center), and to the steps
involved in development and delivery of product support (bottom of
figure), through the end of program. The JSF Development and Low Rate
Initial Production (LRIP) phases of the program ensure that the Air
System and its support systems are mature as Full Rate Production (FRP)
and Initial Operational Capability thresholds are reached. Software
sustainment plans and estimates, for the In-Service portion of Figure 2,
extend 50 years.

Performance-Based Contracting

The F-35 program includes partner participation by the U.S. Air Force,
Navy, and Marine Corps; the United Kingdom; Italy; the Netherlands;
Turkey; Canada; Australia; Denmark; and Norway. Additional foreign
military sales are under consideration. Warfighters from these militaries
will channel their needs through the JSF Program Office (JPO). A joint
agreement on F-35 production, sustainment, and follow-on development will
guide the evolution of the Air System and set ground rules for partner
participation.


The JPO is the single point of contractual direction from warfighters to
JSF principal partners and to propulsion system contractors. JSF
principal partners include Lockheed Martin Aeronautics (the Product
Support Integrator [PSI]), Northrop Grumman, and BAE SYSTEMS. The PSI is,
in turn, responsible for managing the F-35 global industrial base of
United States and international suppliers and depots.

The performance-based contracting model can be visualized as a loop,
which begins and ends with the warfighters and flows through the JPO, the
PSI, and the global industrial base. Warfighters express their needed
capabilities or changes to the JPO, along with their required performance
levels expressed in terms of mission effectiveness, aircraft
availability, sorties per month, etc. Performance-based contracts from
the JPO then transfer the risk and responsibility to provide specified
performance levels to the PSI and to the industrial base. Metrics
quantify air system performance and incentives or penalties. These
performance metrics are the basis for a variable pricing component,
referred to as power by the hour. Payment, under performance-based
contracting, is thus based on usage instead of breakage. Additionally,
price improvement targets/curves are established to drive reduction in
cost over the term of a contract. In the event that design changes are
implemented to improve performance, resulting cost reductions are shared
between customers (in reduced price) and industry (in increased profit).
The point of performance-based contracts is encapsulated in the term
Performance Based Outcomes (PBO). Warfighters are now contracting for an
outcome, or a result, as opposed to contracting for repairs,
replacements, supplies, inventory, shipment, or services.
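
To make the idea of usage-based payment with a variable, metric-driven component concrete, the following sketch shows one way such a calculation could look. It is not the F-35 pricing formula, which the article does not give: the function name, the linear adjustment, the cap, and every number are hypothetical.

```python
def usage_based_payment(flight_hours, rate_per_hour,
                        achieved_availability, target_availability,
                        adjustment_per_point=0.01, cap=0.10):
    """Payment driven by usage, scaled by performance against a target.

    Hypothetical rule: each percentage point of availability above or
    below the target changes the payment by adjustment_per_point,
    limited to +/- cap of the base amount.
    """
    base = flight_hours * rate_per_hour  # pay for usage, not breakage
    delta_points = (achieved_availability - target_availability) * 100.0
    adjustment = max(-cap, min(cap, delta_points * adjustment_per_point))
    return base * (1.0 + adjustment)


# Hypothetical period: 1,000 flight hours at 100 currency units per hour.
print(usage_based_payment(1000.0, 100.0, 0.80, 0.80))  # target met
print(usage_based_payment(1000.0, 100.0, 0.75, 0.80))  # below target
```

The design point the sketch illustrates is that the contractor's revenue depends on hours flown and on measured performance, so the risk of poor availability sits with the provider rather than with the warfighter.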


Managing Software in a Performance-Based Environment

Just  as  performance  of  the  F-35  Air  System 
is  predicated  on  software,  so  is  the  success 
of  performance-based  contracting.  Soft¬ 
ware  is  viewed  as  a  crucial  commodity 
among  many  that  must  be  managed  for  pre¬ 
dictability.  This  article  will  proceed  to  exam¬ 
ine  the  keys  to  successful  sustainment  of 
software  in  a  PBO  environment.  First,  how¬ 
ever,  the  PBO  software  sustainment  domain 
will  be  scoped  by  reviewing  boundaries,  def¬ 
initions,  and  success  criteria. 

As an entry point to analysis of software sustainment, a boundary graphic
(Table 1) was produced to delineate software services funded under PBO,
as opposed to those funded otherwise. To summarize the graphic, any
software sustainment action taken to maintain the delivered software
baseline falls within PBO; any software sustainment action which adds or
changes functionality in the software baseline falls outside of PBO.

These  boundaries,  however,  must  be 
accompanied  by  precise  definitions  of  soft¬ 
ware  changes,  releases,  and  support  services. 
Software  changes  are  defined  in  terms  of 
priority  (routine,  urgent,  and  emergency)  and 
purpose  (corrective,  adaptive,  perfective,  new 
capability,  performance  initiative,  technolo¬ 
gy  sustainment,  or  technology  insertion). 
Software  release  types  are  defined  in  terms 
of  size  and  tempo  (such  as  major  or  minor 
block  releases,  AV  maintenance  updates,  or 
asynchronous  releases  of  AL  software). 
Typical  timespans  for  development  are 
associated  with  software  change  and  release 
categories.  These  values  form  the  basis  for 
sustainment  cost  models  and  a  single,  inte¬ 
grated  master  sustainment  plan. 
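
The change definitions and the funding boundary can be expressed as a small lookup. This is a sketch only: the enum and function names are invented for illustration, while the priority and purpose values and their assignment to a contract follow the definitions above and Table 1.

```python
from enum import Enum


class Priority(Enum):
    ROUTINE = "routine"
    URGENT = "urgent"
    EMERGENCY = "emergency"


class Purpose(Enum):
    CORRECTIVE = "corrective"
    ADAPTIVE = "adaptive"
    PERFECTIVE = "perfective"
    NEW_CAPABILITY = "new capability"
    PERFORMANCE_INITIATIVE = "performance initiative"
    TECHNOLOGY_SUSTAINMENT = "technology sustainment"
    TECHNOLOGY_INSERTION = "technology insertion"


# Purposes that maintain the delivered baseline (PBO side of Table 1).
PBO_PURPOSES = frozenset({
    Purpose.CORRECTIVE,
    Purpose.PERFORMANCE_INITIATIVE,
    Purpose.TECHNOLOGY_SUSTAINMENT,
})


def funding_contract(purpose: Purpose) -> str:
    """Return which contract funds a change with the given purpose."""
    return "PBO" if purpose in PBO_PURPOSES else "follow-on development"


for purpose in Purpose:
    print(f"{purpose.value}: {funding_contract(purpose)}")
```

Priority is modeled but deliberately not used in the lookup, because the boundary described in the text depends on whether a change adds or changes functionality, not on how urgent it is.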

Measurement  and  analysis  of  F-35  soft¬ 
ware  performance  will  be  driven  by  perfor¬ 
mance-based  metrics  derived  from 
Performance-Based  Agreements  between 
the  JPO  and  the  warfighters.  The  top-level 
metrics  are  decomposed  into  a  metrics  tax¬ 
onomy.  This  metrics  taxonomy  encompass¬ 
es  all  influences  upon  the  top-level  PBO 
metrics.  Accordingly,  software  performance 
is  measured  and  analyzed  to  quantify  its 
operational  performance.  The  lower-tier 
software  metrics  reveal  the  influence  of 
software  on  F-35  air  system  PBO  metrics 
and  serve  to  initiate  improvements  in  per¬ 
formance  and  predictability.  Table  2  pro¬ 
vides  a  look  at  how  software  might  influ¬ 
ence  air  system  performance  and  how  such 
influences  will  be  tracked. 
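
As an illustration of how a lower-tier software metric relates to its top-level PBO metric, the sketch below computes Percent Sorties Flown and Percent Sorties Not Flown Due to Software (both named in Table 2) from a list of sortie records. The record layout and the cause labels are assumptions; the article does not describe how the data is captured.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Sortie:
    flown: bool
    cause_if_not_flown: Optional[str] = None  # hypothetical cause label


def percent_sorties_flown(sorties: List[Sortie]) -> float:
    """Top-level metric: share of scheduled sorties that were flown."""
    return 100.0 * sum(s.flown for s in sorties) / len(sorties)


def percent_not_flown_due_to_software(sorties: List[Sortie]) -> float:
    """Software metric: share of scheduled sorties lost to software."""
    lost = sum(1 for s in sorties
               if not s.flown and s.cause_if_not_flown == "software")
    return 100.0 * lost / len(sorties)


# Hypothetical month: 10 scheduled sorties, 8 flown, 2 lost.
schedule = [Sortie(True)] * 8 + [
    Sortie(False, "software"),
    Sortie(False, "weather"),
]
print(percent_sorties_flown(schedule))
print(percent_not_flown_due_to_software(schedule))
```

Tracking the software share separately is what lets the lower-tier metric show how much of a shortfall in the top-level metric is attributable to software.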

Keys  to  Successful  PBO 
Sustainment  of  F-35  Software 

As stated earlier, the ramifications of performance-based contracting for
F-35 software are far-reaching. Looking forward through 2063, factoring
in the rate of technological change and considering security and safety
ramifications, sustainment of F-35 software quickly moves from
far-reaching to prodigious. As such, the following eight key steps are
being taken to manage this commodity.

1. Strive for Commonality

The JSF program, from its inception, has been built upon the following
four pillars: affordability, lethality, survivability, and
supportability. The extent to which a common software baseline is
retained across F-35 variants and F-35 international partners will
directly affect overall affordability and supportability. While air
system software is tailorable and compatible with each owning service’s
support environments, continued emphasis on commonality will maximize
affordability and supportability for all system users. A common solution,
employing minimal infrastructure, provides best value sustainment
capability at minimum cost to all parties.

The ultimate goal of all participants, therefore, is to reach consensus
on a common sustainment solution and, thereby, minimize the incidence of
multiple system/software configurations. However, some unique
capabilities will be necessary to satisfy specific operational needs,
address sovereign requirements, and alleviate political and industrial
concerns. Unique software capabilities will typically occur in two areas:
in the AV Mission Systems software configuration (specifically, in
weapons controls, pilot-vehicle interface, and
communications/interoperability), or as additions to the F-35 software
integration and test supporting infrastructure. In either event, F-35
partner countries will have the opportunity to have their changes
considered for inclusion in the common baseline before steps are taken to
assess the cost and impacts of a unique software change. Any unique
functionality will be encapsulated to minimize re-verification expense
and the cost will be borne by the partner/partners involved on a
pay-to-be-different basis.

Covered by PBO Contract
• Maintenance
  - Corrective Maintenance
  - Urgent and Emergency Corrections
• Performance Initiatives
  - Operational Administration
  - Day-to-Day Administrative Support
• Sustaining Engineering
  - Planning
  - Studies
  - Standing Boards, Configuration Management
  - Needs Analysis
  - Lab and Development
  - Environment/Infrastructure
  - Programming Infrastructure
  - Field Deficiency Assessment
• Technology Sustainment
• Production Retrofit

Covered by Follow-On Development Contract
• New Capability
• Technology Insertion
• Adaptive Change
• Perfective Change
• Urgent Operational Need
• Emergency Operational Need

Table 1: Performance-Based Contracting Boundary


2.  Apply  Industrial  Engineering 
Practices  to  Software 

Many parameters must be considered to plan and manage the
performance-based F-35 software sustainment domain with a globally
dispersed, multinational future fleet. These parameters include the
frequency and quantity of software changes, the number of versions, and
the time required for development, validation, and distribution. AV load
times are important, even more so when viewed over a 50-year period. In
this context, F-35 software sustainment emerges as an industrial
engineering field, where efficiency, consistency, the elimination of
waste, and a solid understanding of capacity must be achieved. To help
bring order and quantification to the F-35 software industry, modeling
and measurement are being employed.

Table 2: Software Influence on Performance-Based Metrics

Performance Criteria: Readiness/Availability
  Top-Level PBO Metrics:
    • Aircraft Availability
    • Mission Capable Aircraft Availability (AA) Rate
  Software Metrics:
    • Aircraft Downtime (Software)
    • Software Mission Capability

Performance Criteria: Mission Effectiveness
  Top-Level PBO Metrics:
    • Mission Effectiveness
      - Primary Task Not Accomplished; System Condition
      - Secondary Task Not Accomplished; System Condition
  Software Metrics:
    • Primary Task Not Accomplished (Software)
    • Secondary Task Not Accomplished (Software)

Performance Criteria: Required Sorties and Flying Hours Accomplished
  Top-Level PBO Metrics:
    • Percent Sorties Flown
    • Percent Flying Hours Flown (Flown, Ground Abort, Cancelled)
  Software Metrics:
    • Percent Sorties Not Flown Due to Software
    • Percent Flying Hours Not Flown Due to Software

Performance Criteria: Logistics Footprint
  Top-Level PBO Metrics:
    • Logistics Footprint Data
      Total Change = Support Equipment Change + Personnel Change
  Software Metrics:
    • Support Equipment Change Due to Software
      - Support Equipment Size Change Due to Software
      - Support Equipment Quantity Change Due to Software
    • Personnel Change Due to Software
      - Direct Manpower Change Due to Software
      - Other Manpower Change Due to Software

Performance Criteria: Military Level of Effort
  Top-Level PBO Metrics:
    • Cannibalizations per 1,000 Flight Hours (FH)
    • Maintenance Man-Hours per FH
    • Maintenance Man-Hours per FH (A/C Subsystem)
  Software Metrics:
    • No Software Metric Applicable
    • Software Maintenance Man-Hours per Flight Hour
    • Software Maintenance Man-Hours per Flight Hour (A/C Subsystem)

An end-to-end software sustainment process model supports the business
case analysis of F-35 software sustainment. The model is expressed as an
event-driven process model using the Architecture of Integrated
Information Systems modeling tool (ARIS). ARIS was selected based on its
features, its standardization within Lockheed Martin, and its use in
development of interfacing models (F-35 AL operation guides, software
integration and test lab modeling, system build, software loading
processes, and software distribution). The software sustainment model
starts with the field report of a software defect and ends with
measurement of performance of the delivered, operational software
solution. The model provides a means to understand tasks and capacity
constraints, and supports estimation of sustainment costs.

An  example  analysis  area  within  the 
model  is  the  process  of  system  build. 
System  build  is  the  packaging  of  hundreds 
of  lower-tier  software  products  to  create  a 
release  product  set.  A  complete  air  system 
build  package  is  a  single,  deliverable  soft¬ 
ware  product  (end  item)  to  the  fleet,  for 
installation  and  operation.  It  is  an  organized 
assembly  of  vehicle  system,  mission  system, 
and  AL  software,  along  with  release  docu¬ 
mentation,  flight  clearance,  technical  data, 
and  other  supporting  version  information 
to  facilitate  identification  and  distribution. 
System  build  is  a  process  that  will  be  per¬ 
formed  many  times  during  each  mainte¬ 
nance  release  cycle  and  is  critical  to  both  the 
delivery  timeline  and  overall  load  integrity. 

Table 3: Software Maintenance Benchmarking Participants

Benchmarking Projects
British AV-8B
U.S. AV-8B
A-10
B-1B
B-2
C-17
C-130
JE-3
F-15
F-16
F-18
F-22
F-117
P-3C

Tight controls are applied to this crucial handover point. The system
build/final software integration process is dependent upon safeguards and
controls imposed upon lower-tier software builds. All required
certifications, qualifications, formats, and approvals must be applied
throughout the software build hierarchy. Accordingly, three
checkpoints/readiness gates were established to ensure that files and
artifacts obtained for system build are completed, correctly formatted,
fully described, and duly authorized. The gates affirm that all required
software components, along with related files or data, are available and
have been properly identified, are functionally acceptable, have achieved
all required certifications and qualifications, and that interfaces
comply with applicable requirements and descriptions. Transfer of files
and artifacts through these gates is controlled with checklists, which
specify criteria and are administered through a review and approval
process. To the extent possible, additional safeguards have been
incorporated in tools and workflows.
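
A checklist-driven readiness gate of the kind described above might be sketched as follows. The four criteria are taken from the text (completed, correctly formatted, fully described, duly authorized); the class, field names, and sample artifacts are hypothetical, and the actual F-35 checklists are not described in the article.

```python
from dataclasses import dataclass
from typing import List

# Checklist criteria named in the text, as attribute names.
CRITERIA = ("complete", "correctly_formatted", "fully_described", "authorized")


@dataclass(frozen=True)
class Artifact:
    name: str
    complete: bool
    correctly_formatted: bool
    fully_described: bool
    authorized: bool


def gate_findings(artifacts: List[Artifact]) -> List[str]:
    """List every failed criterion as 'artifact name: criterion'."""
    return [f"{a.name}: {criterion}"
            for a in artifacts
            for criterion in CRITERIA
            if not getattr(a, criterion)]


def gate_passes(artifacts: List[Artifact]) -> bool:
    """A gate passes only if no artifact fails any criterion."""
    return not gate_findings(artifacts)


# Hypothetical lower-tier products arriving at a gate.
incoming = [
    Artifact("mission_systems_load", True, True, True, True),
    Artifact("release_notes", True, True, False, True),
]
print(gate_passes(incoming))
print(gate_findings(incoming))
```

Returning the individual findings, rather than only a pass/fail flag, matches the review-and-approval use described in the text, where a reviewer needs to see which criterion blocked the transfer.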

Apart  from  measuring  the  integrity  of  the 
system  build  process,  a  set  of  metrics  is 
maintained  to  enable  capacity  planning  for 
system  build.  As  releases  are  produced,  span 
times,  touch  times,  execution  times,  delay 
times,  and  total  effort  metrics  are  tracked 
for  routine  and  urgent  builds.  The  results 
are  synthesized  into  cost  estimates  and 
process  improvement  initiatives. 
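
The capacity-planning metrics could be summarized per build priority along these lines. The metric names come from the text; the record layout, the hour units, and the sample values are assumptions, and the article does not say how the program aggregates its data.

```python
from collections import defaultdict
from statistics import mean

# Metric names taken from the text; all values are assumed to be hours.
METRICS = ("span", "touch", "execution", "delay", "effort")


def summarize_builds(builds):
    """Average each metric separately for routine and urgent builds."""
    grouped = defaultdict(list)
    for build in builds:
        grouped[build["priority"]].append(build)
    return {
        priority: {m: mean(b[m] for b in rows) for m in METRICS}
        for priority, rows in grouped.items()
    }


# Hypothetical build records.
history = [
    {"priority": "routine", "span": 40, "touch": 10, "execution": 20,
     "delay": 10, "effort": 30},
    {"priority": "routine", "span": 60, "touch": 14, "execution": 26,
     "delay": 20, "effort": 40},
    {"priority": "urgent", "span": 12, "touch": 6, "execution": 5,
     "delay": 1, "effort": 11},
]
for priority, averages in summarize_builds(history).items():
    print(priority, averages)
```

Averages like these are one possible input to the cost estimates and process improvement initiatives mentioned above; a real analysis would likely also look at variation and trends over successive releases.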

3.  Engage  Customers 

The  F-35  software  life  cycle  was  planned  in 
progressive  stages.  Each  stage  engaged 
users  and  partner  country  subject  matter 
experts.  First,  U.S.,  U.K.,  and  international 
standards  for  software  maintenance  and 
support,  including  IEEE,  Society  of 
Automotive  Engineers,  and  military  stan¬ 
dards  were  canvassed  to  form  a  foundation 
for  sustainment  planning. 

Next,  fact-finding  was  accomplished 
through  a  benchmarking  study  of  software 
maintenance  operations  across  14  military 
aircraft  programs.  A  list  of  software  mainte¬ 
nance  operations  willing  to  share  their 
expertise  is  shown  in  Table  3. 

A questionnaire was developed by F-35 Integrated Product Teams (IPTs). The
benchmarking study manager used the questionnaire to conduct interviews
with 39 representatives from the 14 software maintenance programs. The
information obtained from these interviews was compiled and summarized in
a report. The study served to identify practices which required
consideration for adoption, or avoidance, by the F-35 program. It affirmed
the importance of thorough planning, establishment of communication
channels and information flows, and compliance with clearly documented
processes. The study also revealed that successful support of
multinational customers requires focused attention on several elements
(e.g., export controls, software storage and segregation, and joint
acceptance criteria for software changes).

Following  the  benchmarking  study, 
results  were  incorporated  into  the  software 
life-cycle  plan  for  the  F-35.  The  plan  was 
subject to several rounds of review by graybeard
panels. Each round of review was
conducted  during  a  30-day  timeframe, 
beginning  with  a  kickoff,  followed  by  indi¬ 
vidual  preparation.  Panelists  invested  an 
average  of  8.5  hours  to  study  the  plan,  rate 
the  contents,  and  prepare  their  preliminary 
comments.  Deep-dive  assignments  were 
allocated  to  focus  specific  panelists  on 
selected  topics.  Panels  were  then  convened 
for  a  face-to-face  walk-through  of  topics 
over  the  course  of  several  days,  and  results 
were  summarized  in  outbriefs.  More  than 
1,000  comments  and  recommendations 
were  raised  and  addressed  as  the  result  of 
the  graybeard  panels.  (Responses  to  several 
of  these  recommendations  are  noted 
throughout  this  article.) 

Interactions  with  graybeard  panelists 
opened the door for visits to seven U.S. and
international  software  maintenance  opera¬ 
tions  by  F-35  representatives.  These  visits 
resulted  in  useful  dialogue  with  customers 
and  software  sustainment  personnel  from 
11 aircraft programs. Again, a standard list
of topics and questions was used to ensure
consistency. 

Finally,  with  a  reasonably  mature  life- 
cycle  plan  in  place,  subject  matter  experts 
from  partner  countries  were  directly 
engaged  with  JSF  contractors  in  a  Software 
Maintenance  and  Sustainment  Working 
Group (SMS WG), a team of 40 participants
composed of equal parts contractors
and customers. The SMS WG is chartered
to  ensure  customer  expectations  relative  to 
JSF  software  sustainment  are  considered  in 
communicating,  planning,  documenting, 
contracting,  and  scheduling  of  affordable 
software  sustainment  solutions. 

4.  Adopt  a  Holistic  Approach  to 
Sustainment 

Earlier in this article, the contractual boundaries between PBO
maintenance changes and follow-on development changes were emphasized.
While this is keenly important from a funding standpoint, PBO-driven
software changes cannot be viewed independently. PBO-driven software
changes must be weighed in the context of performance of the global F-35
fleet and balanced for their impact on overall system change capacity of
the F-35 enterprise. Ultimately, the effect of any software change will
be evaluated in terms of net worth provided to the warfighter.

F-35 block updates will bundle the delivery of new functionality and PBO
maintenance changes. Block updates will include a mix of all types of
software changes, and may encompass hardware/subsystem changes. As IPTs
develop and produce the changes to support F-35 block plans, capacity
planning must allow for both PBO changes and for new functionality.
Accordingly, the processes, definitions, requirements, and practices for
software maintenance planning were merged with the F-35 template for
software development plans.

On  the  plus  side,  technical  solutions  are 
not  constrained.  System  changes,  hardware 
changes  and  software  changes,  corrections, 
and  new  functionality  are  all  assessed  with 
respect  to  the  end  effect  on  air  system  per¬ 
formance  and  affordability. 

On the downside, cost accounting within
IPTs responsible for producing corrective
changes and new functionality requires
extreme fidelity to ensure PBO effort is
distinguished from follow-on development.

5.  Develop  Highly  Maintainable 
Systems  and  Software 

Maintainability of F-35 software is based
upon an AV with an open and scalable
architecture. The open architecture allows for
expansion with minimal impact to
unchanged elements, through use of well-defined,
non-proprietary interfaces and protocols.
Hardware and software elements are
partitioned using loosely coupled, non-time-critical
interfaces. A data collection domain,
added  to  the  AV  architecture,  supports  gen¬ 
eral  instrumentation  and  fault  isolation 
requirements.  An  isolation  layer  protects  the 
software  investment  from  hardware  obso¬ 
lescence  and  facilitates  multi-use  across  air 
system  domains. 

DOORS  (Dynamic  Object  Oriented 
Requirements  System)  databases  are  popu¬ 
lated  with  requirements  and  the  rationale 
for  their  selection,  including  linkages  to  the 
architectural  models.  Modeling  and  simula¬ 
tion  tools  based  on  architectural  constructs 
are  employed  to  develop  and  validate  the 
requirements  and  verify  the  air  system. 

Object-oriented  design  results  in  smaller 
configuration  items  with  clearly  defined 
functionalities. Hardware independence supports
problem accountability and localizes
change. This approach results in minimal
changes  to  the  overall  configuration  when 
adding  or  deleting  functional  capability.  It 
also  supports  change  development  at  the 
lowest  level,  minimizes  the  impact  of 
changes, and enables focused testing. Focused,
model-based component testing provides opportunities
for efficiency in an area that typically
entails high cost.

Architectural  and  design  integrity  is 
maintained  through  use  of  structured,  com¬ 
mon  systems  engineering  processes  and  tools, 
which  have  reduced  initial  development 
costs  and  will  support  the  efficient  long¬ 
term  maintenance  of  designs.  The  JSF 
Systems/Software  Engineering  Environ¬ 
ment  (S/SEE)  resides  on  networks  of 
computing  equipment  which  connect  F-35 
customers,  contractors  and  subcontrac¬ 
tors.  This  shared  environment  is  used 
across  the  entire  team  to  foster  a  unified 
understanding  of  the  open  architecture 
and  its  maintenance.  The  S/SEE  includes 
commercial  off-the-shelf  (COTS)  Unified 
Modeling  Language  tools,  such  as 
Rhapsody, for enforcing the object-oriented
software design, and the Signal Interface
Management System (SIMS) tool (see Note 2), which
forces full interface definition.

Autocoding with S/SEE tools allows
software  code  to  be  developed  to  Open 
System  Architecture  (OSA)  standards.  In 
specific  domains,  design  tools  are  used  to 
capture  requirements,  model  a  functional 
algorithm  and  provide  source  code  as  an 
output  to  implement  the  modeled  design. 

Emphasis is placed on leveraging
COTS and on supportability (aggressive preparation
for tool obsolescence). Configurations
of  S/SEE  tools  are  managed  in  consonance 
with  air  system  software  releases  to  ensure 
build  repeatability  and  maintainability. 

Maintainability  of  F-35  software  is  also 
supplemented  by  software  reuse  and  multi-use. 
(Reuse  is  the  use  of  pre-existing  code; 
multi-use  is  the  reuse  of  code  in  multiple 
areas.)  Software  with  proven  maturity  and 
reliability,  from  various  sources  including 
legacy  aircraft,  government-furnished 
equipment,  software  and  databases,  differ¬ 
ent  F-35  variants,  off-the-shelf  software, 
subsystem  software,  and  third  party  suppli¬ 
ers,  is  reused  to  the  extent  possible.  Reuse 
objects  are  also  used  as  starting  points  for 
development  of  other,  closely  related 
objects that may be required for different F-35
domains. Availability and quality of documentation
and source files are considered
as  part  of  reuse  analysis  and  determination. 
Preconditioning  of  software  targeted  for  reuse 
may  be  performed  as  necessary  in  order  to 
ensure  maintainability.  If  candidate  code 
does  not  meet  OSA  standards  or  object  ori¬ 
entation,  but  provides  needed  functionality 
with  an  effective  design,  the  design  and 
algorithms  (only)  are  reused  and  recoding  is 
performed  using  architectural  patterns  to 
develop  software  which  provides  better 
long-term  maintainability  and  lower  total 
ownership  cost  than  pre-existing  software 
that  is  wrapped  to  be  OSA  compliant. 


6.  Manage  Off-the-Shelf  Software 

The up-front savings realized through use
of off-the-shelf software are frequently offset
by risk and expense incurred later in
the product life cycle. (Off-the-shelf software
includes COTS, modified off-the-shelf
[MOTS], government off-the-shelf
[GOTS],  freeware,  public  software,  and 
related  categories  of  non-developed  soft¬ 
ware.)  Accordingly,  special  emphasis  was 
placed on managing off-the-shelf software.
Through a collaborative effort with
the JPO, a process document entitled
“Off-The-Shelf Software in the JSF
Software Lifecycle” [3] was produced and
deployed. The process document describes
actions,  over  and  above  standard  software 
process  requirements  applicable  to  software 
developed  specifically  for  the  JSF  Program, 
which  must  be  taken  to  ensure  off-the-shelf 
software  components  are  configuration 
managed  throughout  their  life  cycle.  It  also 
identifies  the  functions/roles  responsible 
for  those  actions.  Instructions  in  the 
process  are  partitioned  according  to  their 
applicability  to  a  4-phase  life  cycle,  along 
with  generally  applicable  rules,  guidelines, 
and  warnings. 

Based  on  results  of  a  2005  collaborative 
evaluation  of  process  implementation, 
“Off-The-Shelf  Software  in  the  JSF 
Software  Lifecycle”  was  revised  and  updat¬ 
ed  to  include  a  system  for  classification  of 
off-the-shelf  software  projects  (small,  medi¬ 
um,  or  large  projects)  based  on  specified  cri¬ 
teria.  Suggested  tailoring  of  process  require¬ 
ments  based  on  project  category  is  included 
along with examples. The revised process
also incorporates a requirement for
generation of a compliance matrix by off-the-shelf
software projects, to ensure that all
applicable  requirements,  including  license 
and  distribution  controls,  are  adequately 
addressed. 

7.  Plan  for  the  Unexpected 

Warfighters  are  keenly  interested  in  how  the 
F-35  AL  global  sustainment  solution  will 
respond  to  urgent  operational  requests,  or 
to  emergencies.  To  answer  these  concerns, 
software  sustainment  scenarios  have  been 
developed  by  the  F-35  SMS  WG.  The  sce¬ 
narios  contain  sufficient  detail  to  describe 
the  activities  required  for  non-routine  situa¬ 
tions.  The  scenarios  are  used  to  exercise 
ARIS  process  models  and  make  an  up-front 
determination of the cost and time required
to perform all needed activities.

8.  Analyze  and  Refine  the  Software 
Sustainment  Business  Case 

The  eight  steps  featured  in  this  article  have 
covered a lot of territory. But the steps
begin and end with a focus on money. Step
1  focused  on  achieving  affordability 
through  commonality.  This  final,  8th  step 
addresses  the  business  case  analysis  of  F-35 
sustainment,  annual  global  sustainment 
total  ownership  cost  estimates,  and  the 
software  cost  estimation  practices  that  sup¬ 
port  these  analyses.  Each  of  these  intercon¬ 
nected  activities  uses  a  spiral  development 
approach  with  each  spiral  providing 
increased  fidelity  of  data,  inclusion  of  deci¬ 
sions  from  across  the  program,  updates  on 
configurations  and  reliability  projections, 
and  comprehensive  detailing  of  the  busi¬ 
ness  offering. 

Business  Case  Analyses  (BCAs)  define 
F-35  global  sustainment  policies  and 
processes. These analyses answer three
basic questions: (1) What individual tasks
must be performed during sustainment?
(2) Who (government or contractor) should
perform those tasks, based on best value?
(3) Does the task allocation support
established performance standards and
provide sufficient savings? BCAs address,
for  example,  a  pricing  architecture  for  PBO 
sustainment,  international  taxes  and  tariffs, 
industrial  base  capacity  and  responsiveness, 
and,  of  course,  software  sustainment. 

Global  sustainment  cost  estimates  are 
formulated  annually.  These  estimates  inte¬ 
grate  estimates  from  all  IPTs,  functions, 
team  member  companies,  and  subcontrac¬ 
tors  involved  in  the  F-35  global  sustain¬ 
ment  solution  and  the  pilot  program  for 
performance-based  sustainment.  Annual 
cost  estimates  are  consistently  produced, 
fact-based,  and  supportable.  They  are  rec¬ 
onciled  with  the  JPO  affordability  cost  ana¬ 
lysts  and  are  finalized  and  formally 
approved  at  JSF  cost  summit  events.  The 
integrity  of  these  annual  global  sustainment 
cost  estimates  is  critical  to  the  success  of 
affordable,  PBO  sustainment  of  the  F-35 
fleet. 

Parametric  software  sustainment  cost 
estimates  are  developed  for  inclusion  in  the 
annual  global  sustainment  estimates  using 
output  from  the  System  Evaluation  and 
Estimation  of  Resources  -  Software 
Estimating  Model  (SEER-SEM)  tool. 
Software  sustainment  cost  estimates  align 
with  Cost  Analysis  Improvement  Group 
Element  6.5,  “Software  Maintenance 
Support.” They are not, however, fully representative
of all costs associated with performance-based
software sustainment and
are subject to ongoing refinement and
updates. Updates provide greater detail and
direct estimates for software integration
and test activities, software lab keep-warm
costs, and greater fidelity with respect to
license costs for off-the-shelf software
included in deliverable F-35 software products.
In the absence of actual data, ground
rules  and  assumptions  are  documented  and 
version-controlled  to  describe  cost  areas 
which  are  included  in,  or  excluded  from, 
the  F-35  software  sustainment  business 
case. 

Finally,  long-term  software  sustainment 
cost  estimates  entail  software  maintenance 
and  software  growth  estimates.  Once  a  soft¬ 
ware  release  (which,  as  we  have  seen,  will 
contain fixes and new functionality) is distributed
to the field, it becomes the new
maintenance baseline for PBO contracting
when the new release takes effect.

Conclusion 

The  decision  to  apply  a  performance-based 
sustainment  approach  to  the  F-35  has 
caused  fundamental  changes  in  the 
approach  to  Air  System  sustainment. 
Traditional  roles  and  responsibilities  are 
shifting.  An  increased  risk  is  transferred  to 
contractors  who  are  now  responsible  for 
system  availability  and  mission  success. 
This  has  precipitated  a  new  approach  to 
software sustainment. While results are
years away, the F-35 software community
has put a foundation in place for PBO software
sustainment. Construction on that
foundation continues, day by day.

Notes

1.  Fifth  Generation  Fighter  features  these 
attributes:  advanced  stealth,  information 
fusion,  high  agility,  enhanced  situational 
awareness,  new  levels  of  reliability  and 
maintainability,  and  network-enabled 
operations. 

2.  SIMS  is  a  Lockheed  Martin  Aero  inter¬ 
nally  developed  interface  management 
tool,  based  on  a  commercial  relational 
database  and  used  on  multiple  aircraft 
platforms. 

This article presents a set of reference metrics for measuring the quality of services in Service-Oriented Architectures (SOAs).
It introduces the metrics cube and scalability curve and applies them to the development of service-level agreements (SLAs)
and capacity planning. This article also discusses the challenges and approaches for defining and allocating end-to-end metrics
in a net-centric environment such as the Global Information Grid (GIG).


In an SOA, a set of loosely coupled
services work together over a network
to provide functionalities to end-users
[1]. The service provider registers information
about a service at a service registry.
Service consumers can find the service
from the registry and then invoke
the service through the service interface.

For  the  Department  of  Defense 
(DoD),  a  set  of  GIG  Enterprise  Services 
will  provide  warfighting,  business,  and 
intelligence  capabilities  to  support  opera¬ 
tional  missions  conducted  by  various 
communities  of  interest  [2].  Examples 
include  Net-Centric  Enterprise  Services 
(NCES)  [3]  and  Net-Enabled  Command 
Capability (NECC) [4].

Services  in  an  SOA  have  well-defined 
service  interfaces.  They  also  have  SLAs 
which  are  parts  of  the  service  contracts 
that  specify  the  levels  of  service  expect¬ 
ed  after  deployment.  A  key  aspect  of  an 
SLA  is  the  set  of  metrics  for  measuring 
performance  and  quality  of  service.  This 
article  develops  an  overarching  model  of 
reference  metrics  relevant  to  end-user 
experience.  It  introduces  the  concept  of 
a  metrics  cube  that  captures  the  relation¬ 
ship  between  the  metrics  which  are  then 
applied  to  the  development  of  SLAs  and 
capacity  planning. 

The  reference  metrics  are  important 
to  the  successful  implementation, 
deployment  and  sustainment  of  SOA  in 
the  GIG  because  of  the  following: 

•  They  form  the  basis  for  combining 
metrics  across  network  and  comput¬ 
ing  infrastructures  for  services  in  the 
GIG  net-centric  environment.  Since 
those  infrastructures  typically  fall 
under  various  responsible  entities, 
having  a  basic  reference  set  is  critical 
to  the  development  of  end-to-end 
metrics  relevant  to  an  end-user’s 
experience. 

• They relate directly to a consumer’s
(end-user’s) experience using a service.
This  includes  timeliness,  scalability, 
availability,  and  reliability,  which  are 
specified  in  SLAs. 


•  They  are  used  throughout  a  system 
engineering  life  cycle,  including 
requirement  definition,  SLA  devel¬ 
opment,  service  design,  performance 
testing,  and  SOA  sustainment. 

Reference  Metrics 

The  reference  metrics  are  collectively 
referred  to  by  their  symbols  as  the 
TSAR  (service  Time,  Scalability, 
Availability,  Reliability)  metrics.  They  are 
defined in more detail in Table 1.

For synchronous services, such as a
request/response Web service, T is simply
the response time. It is measured
from  the  time  a  consumer  sends  a 

*^The  TSAR  metrics  relate 
directly  to  consumer 
(end-user)  experience 
with  a  service  by 
answering  the  following 
questions:  how  fast  (T), 
how  muchlmany  (S),  how 
durable  (A),  and  how 
reliable  (R)/* 


request  to  when  the  consumer  receives  a 
response.  Typically,  T  will  have  an  aver¬ 
age  and  a  standard  deviation.  It  is  the 
sum  of  network  latency  (including  trans¬ 
mission  time,  propagation  time,  Internet 
protocol  delay,  and  congestion)  and  time 
spent  at  the  service  provider  (including 
local  processing  time  and  back-end  pro¬ 
cessing  time). 

For  asynchronous  services,  such  as  a 
messaging  service,  T  is  the  delivery  time. 
It  is  measured  from  when  a  publisher 
sends  a  message  to  when  subscribers 
receive it. T is typically a distribution
with T_min, T_average, and T_max. A
service-level agreement may guarantee
delivery within a certain T_max.
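As a minimal sketch of how the service-time statistics described above might be summarized from measurements, the following Python function computes T_min, T_average, T_max, and the standard deviation, and checks the worst case against a guaranteed T_max. The function name, the sample values, and the 2.0-second limit are illustrative assumptions, not taken from the article.

```python
import statistics


def summarize_service_time(samples, t_max_allowed):
    """Summarize measured service times (seconds) against an assumed T_max guarantee."""
    if not samples:
        raise ValueError("at least one measurement is required")
    return {
        "T_min": min(samples),
        "T_average": statistics.mean(samples),
        "T_max": max(samples),
        # Population standard deviation; 0.0 when there is a single sample.
        "T_stdev": statistics.pstdev(samples),
        # True when even the slowest observed delivery met the guarantee.
        "within_T_max": max(samples) <= t_max_allowed,
    }


# Hypothetical delivery times, in seconds, for five messages.
summary = summarize_service_time([0.8, 1.1, 0.9, 1.6, 1.0], t_max_allowed=2.0)
print(summary)
```

In practice the samples would come from monitoring tools, and the averaging window would follow whatever the SLA specifies.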

Scalability  (S)  measures  a  service’s 
ability  to  handle  growing  amounts  of 
work  within  the  desired  time  and  relia¬ 
bility  ranges.  Examples  are  user  load 
(number  of  users  within  a  certain  time 
span),  number  of  requests  per  unit  time, 
and  size  of  requests  or  messages  over  a 
certain  time. 

Availability  (A)  is  defined  as  one 
minus  the  percentage  of  planned  and 
unplanned  service  down  time.  In  other 
words,  it  is  the  combined  probability 
that  a  service  is  up  and  running.  It  is 
often  expressed  as  a  number  of  nines, 
such  as  99.9  percent  (8.8  hours 
down/year). Contributions to downtime come
from planned hardware/software maintenance,
hardware failure of networks
and processors, and software failure due
to fatal defects (e.g., memory leaks).
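The relationship between an availability figure and yearly downtime can be checked with a short calculation. The sketch below assumes a 365-day year (8,760 hours); the function names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored in this sketch


def availability(planned_down_hours, unplanned_down_hours, period_hours=HOURS_PER_YEAR):
    """Availability A = 1 - (planned + unplanned down time) / measurement period."""
    down = planned_down_hours + unplanned_down_hours
    if not 0 <= down <= period_hours:
        raise ValueError("down time must lie between 0 and the period length")
    return 1.0 - down / period_hours


def downtime_hours_per_year(a):
    """Yearly down time implied by an availability value between 0 and 1."""
    return (1.0 - a) * HOURS_PER_YEAR


# "Three nines" corresponds to roughly 8.8 hours of down time per year.
print(round(downtime_hours_per_year(0.999), 2))
```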

Finally,  Reliability  (R)  is  the  percent¬ 
age  of  service  completion  with  antici¬ 
pated  results  when  the  service  is  avail¬ 
able.  Hence  if  R  =  95  percent,  the  error 
rate  is  5  percent.  Errors  generally  come 
from non-fatal software defects,
requests rejected by a load control mechanism
(when availability is within the
required range), or message loss during
delivery  (e.g.,  due  to  congestion  or  faulty 
network  hardware).  Unexpected  results 
caused  by  problems  in  back-end  pro¬ 
cessing  (e.g.,  time  out  or  failure  of 
dependent  services)  are  also  considered 
errors. Note that reliability is defined at
the application level. It measures how
reliably a service performs its function
when it is up and running.
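To make the definition concrete, the sketch below computes R from the number of requests attempted while the service was available and a tally of errors by cause. The cause labels and counts are hypothetical; the article lists the error sources but gives no data.

```python
def reliability(attempted_while_available, errors_by_cause):
    """R = fraction of requests completed with anticipated results while the service is up."""
    if attempted_while_available <= 0:
        raise ValueError("no requests were attempted")
    errors = sum(errors_by_cause.values())
    if errors > attempted_while_available:
        raise ValueError("more errors than attempted requests")
    return 1.0 - errors / attempted_while_available


# Hypothetical tally for 1,000 requests made while the service was available.
errors = {
    "non_fatal_defect": 10,
    "rejected_by_load_control": 25,
    "message_loss": 5,
    "back_end_failure": 10,
}
r = reliability(1000, errors)
print(f"R = {r:.2%}, error rate = {1.0 - r:.2%}")
```

Requests made while the service is down are excluded here because, per the definitions above, they count against availability rather than reliability.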

The  TSAR  metrics  relate  directly  to 
consumer  (end-user)  experience  with  a 
service by answering the following questions:
how fast (T), how much/many (S),
how durable (A), and how reliable (R).

Metrics  Cube 

The TSAR metrics are not all independent.
In general, as S increases, T goes
up, and R goes down. The plot of T or
R versus S is called a scalability curve.

Metrics        Symbols   Notes
Service time   T         Response time for synchronous services. Delivery time for asynchronous services.
Scalability    S         Examples are user load and number of requests per second.
Availability   A         It includes planned maintenance and unplanned down time.
Reliability    R         Due to defects, rejected requests, message loss, etc.

Table 1: Reference Metrics for SOAs


Figure 1: Scalability Curves (service time T versus S; reliability R versus S)

Figure 1 shows an example based on a
model with a finite service queue [5]. As
S increases, incoming requests/messages
spend more time waiting in the queue,
causing T to increase. Network latency is
not  included  in  this  example.  Also,  in 
Figure  1,  R  includes  the  effect  of  both 
software  defects  and  rejected  requests 
(the  latter  happens  when  the  number  of 
requests  in  the  queue  reaches  an  upper 
limit). The exact values of the curves in
Figure 1 are not important for the discussion
here. An appendix in the online
version of this article provides further
details about the model.
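The qualitative shape of such curves can be reproduced with a standard single-server queue of limited capacity. The sketch below assumes an M/M/1/K queue (Poisson arrivals, exponential service, at most K requests in the system); this may differ from the model in [5], and the service rate, capacity, and defect rate are illustrative values only.

```python
def finite_queue_metrics(s, mu=100.0, capacity=20, defect_rate=0.01):
    """Return (T, R, throughput) for offered load s = arrival rate / mu.

    Assumes an M/M/1/K queue; requests arriving when the system already
    holds `capacity` requests are rejected and counted as errors.
    """
    if s <= 0:
        raise ValueError("offered load must be positive")
    arrival_rate = s * mu
    # Steady-state probability of n requests in the system is proportional to s**n.
    weights = [s ** n for n in range(capacity + 1)]
    total = sum(weights)
    probs = [w / total for w in weights]
    p_reject = probs[-1]  # probability that an arriving request finds the queue full
    mean_in_system = sum(n * p for n, p in enumerate(probs))
    throughput = arrival_rate * (1.0 - p_reject)  # accepted requests per unit time
    service_time = mean_in_system / throughput  # Little's law, accepted requests only
    reliability = (1.0 - p_reject) * (1.0 - defect_rate)
    return service_time, reliability, throughput


for load in (0.2, 0.5, 0.8, 1.0, 1.5):
    t, r, x = finite_queue_metrics(load)
    print(f"S={load:.1f}  T={t:.4f}  R={r:.4f}  throughput={x:.1f}")
```

Under these assumptions T rises and R falls as S grows, and throughput stays below mu, which matches the behavior described for Figure 1.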

In Figure 1, S is the rate of service
requests relative to the maximal
throughput μ. Here throughput is
defined as the number of completed
requests per unit time. As S increases
from zero, the throughput increases linearly
with S. However, as S approaches
one (the rate of service requests
approaching the maximal throughput),
the throughput plateaus at μ and an
increasing fraction of incoming requests
is dropped. This is due to the limited
computing or network infrastructure
supporting the service. The finite queue
model simulates this effect by limiting
the number of requests/messages in the
queue. Below the maximal throughput,
one may use throughput as an approximate
measure for S. This is convenient
since commercial tools typically provide
throughput as one of the measures.

Figure 2: Metrics Cube

The  contributing  factors  to  availabil¬ 
ity  do  not  change  as  S  increases.  Thus, 
availability  can  generally  be  considered  a 
constant.  However,  at  very  high  levels  of 
S,  the  extremely  high  demand  exhausts 
the  underlying  computing  and  network 
infrastructure  for  the  service,  making  it 
unable  to  perform  any  work  (similar  to  a 
denial-of-service  attack).  This  leads  to 
an  abrupt  drop  in  availability.  For  all 
practical  purposes,  it  should  be  deter¬ 
mined  experimentally  that  this  situation 
does  not  occur  within  the  expected 
range  of  S.  Once  this  is  done,  availabili¬ 
ty  can  be  considered  independent  of  S. 
The  following  discussion  assumes  that 
this  is  true. 

To  help  visualize  the  scalability  behav¬ 
ior  of  a  service,  one  may  define  a  Metrics 
Cube  using  the  minimum  and  maximum 
values  of  S,  T,  and  R.  The  boundary  S_min 
represents  the  threshold  value  and  corre¬ 
sponds  to  the  lower  bound  of  normal 
operation.  S_max,  on  the  other  hand,  is 
the  objective  or  peak  operation  value.  As  S 
increases,  the  state  of  a  service  can  be 
traced  along  a  scalability  curve  in  the  cube, 
as  shown  in  Figure  2.  The  metrics  cube  is 
useful  for  specifying  SLAs.  This  is  dis¬ 
cussed  in  the  next  section. 
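One way to picture the metrics cube is as a simple data structure holding the six boundary values, with a check that measured points on a scalability curve stay inside it. The class and method names below are assumptions made for illustration; the article does not prescribe a representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricsCube:
    """Minimum and maximum values of S, T, and R (illustrative representation)."""
    s_min: float
    s_max: float
    t_min: float
    t_max: float
    r_min: float
    r_max: float

    def contains(self, s, t, r):
        """True when the operating point (s, t, r) lies inside the cube."""
        return (self.s_min <= s <= self.s_max
                and self.t_min <= t <= self.t_max
                and self.r_min <= r <= self.r_max)

    def curve_within(self, curve):
        """Check every (s, t, r) point whose S falls inside the cube's S range."""
        return all(self.contains(s, t, r)
                   for s, t, r in curve
                   if self.s_min <= s <= self.s_max)


# Hypothetical requirement: 10 to 100 requests/s, T up to 2 s, R at least 95 percent.
cube = MetricsCube(s_min=10, s_max=100, t_min=0.0, t_max=2.0, r_min=0.95, r_max=1.0)
measured = [(10, 0.5, 0.999), (50, 1.2, 0.99), (100, 1.9, 0.96)]
print(cube.curve_within(measured))
```

Points with S outside the range between S_min and S_max are ignored, since the cube only constrains behavior within the agreed load range.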

SLAs 

The essence of an SLA is to specify a
metrics cube and a required availability
(A) range, as well as the statistical calculation
of the metrics (e.g., average over
an hour, one day, etc.). A nominal
process of developing an SLA follows:

1.  Service  provider  measures  the  scala¬ 
bility  curve  for  a  service. 

2.  Service  consumers  submit  service- 
level  requirements,  which  can  be 
expressed  as  a  metrics  cube  and 
availability  range. 

3.  Service  provider  compares  scalability 
curves  with  the  requirements. 

4.  Service  provider  adds  or  subtracts  com¬ 
puting  resources  to  do  the  following: 

a.  Optimize  the  scalability  curve 
within  the  metrics  cube. 

b.  Meet  the  desired  availability  range. 

5. Service provider refines and negotiates
the SLA with the consumers
(based on cost, schedule, and other
factors).

Figure 3: Optimization of Scalability Curve


In  step  4,  the  scalability  curve  is 
shifted  within  the  metrics  cube  when 
computing  resources  are  changed. 
Figure  3  shows  a  cross-section  of  the  T 
and  S  plane  in  a  metrics  cube.  If  the 
upper end of the scalability curve touches
the T_max boundary, the SLA may
potentially be violated because the service
time will exceed the allowed maximum
before S_max is reached. By
adding computing resources (e.g., more
servers), the curve is shifted downward.

However,  if  overdone,  the  curve 
comes  well  below  T_max  at  the  S_max 
boundary,  indicating  over-engineering 
and  wasted  computing  resources;  thus, 
the  optimal  configuration  is  to  have  the 
curve  touch  the  upper  corner  (or  some¬ 
what  below  it  as  a  reserve).  Similarly,  the 
optimal  curve  for  reliability  should 
touch  the  corner  at  (S_max,  R_min).  In 
terms  of  the  metrics  cube  in  Figure  2, 
the  optimized  scalability  curve  would 
touch  the  corner  labeled  with  a  triangle. 
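The sizing decision described above can be sketched as a search for the smallest number of servers that keeps T at or below T_max when the load reaches S_max. The sketch assumes identical servers, an even split of the load, and the simple M/M/1 relation T = 1 / (mu - load per server); a real service would use its measured scalability curves instead. All names and numbers are illustrative.

```python
def servers_needed(s_max, t_max, mu, reserve=0.0, limit=10_000):
    """Smallest server count n with T(S_max) <= t_max * (1 - reserve).

    s_max: peak request rate; mu: service rate of one server (same units);
    reserve: fraction of t_max held back as a safety margin.
    """
    target = t_max * (1.0 - reserve)
    for n in range(1, limit + 1):
        load_per_server = s_max / n
        if load_per_server >= mu:
            continue  # each server would be saturated; add more
        t_at_peak = 1.0 / (mu - load_per_server)
        if t_at_peak <= target:
            return n, t_at_peak
    raise ValueError("target service time is not reachable under these assumptions")


# Hypothetical SLA: up to 250 requests/s with T no greater than 0.05 s;
# each server completes 100 requests/s.
n, t = servers_needed(s_max=250.0, t_max=0.05, mu=100.0)
print(n, round(t, 4))
```

Choosing the smallest n that satisfies the target corresponds to having the curve meet the cube just below the corner; a larger n would be the over-engineered case.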

To  be  compliant  with  an  SLA,  the 
service  provider  needs  to  ensure  that 
availability  is  within  the  required  range. 
When  the  service  is  up  and  running,  the 
service  provider  monitors  the  metrics 
and  ensures  that  they  stay  within  the 
required  metrics  cube. 

Closing  Remarks 

This  article  defines  a  set  of  four  refer¬ 
ence  metrics  (collectively  called  TSAR 
metrics)  for  measuring  the  quality  of 
sustained  services  in  SOAs.  It  introduces 
the  concept  of  a  metrics  cube,  which  is 
applied  to  the  development  of  SLAs  and 
capacity  planning  of  computing 
resources. 

For example, for NCES, a set of
threshold and objective metrics has
been defined. They are the equivalent of
the minimal and maximal boundaries of
the metrics cube. For NECC, the TSAR
metrics  are  included  in  the  developer’s 
guide  [6]  and  used  in  the  system  engi¬ 
neering  process,  most  notably  for  SLA 
development. 

An  inherent  challenge  for  defining 
end-to-end  metrics  (such  as  the  TSAR 
metrics  in  Table  1)  is  that  they  typically 
have  distributed  contributions  across 
network  and  computing  infrastructures. 
Hence  the  responsibility  for  ensuring 
SLA  compliance  is  shared  by  multiple 
entities  in  a  net-centric  environment 
such as the GIG. Nevertheless, the service
provider is normally the primary
sustainment interface to its service consumers.
The provider allocates metrics
to  its  dependent  network  and  computing 
service  providers,  either  through  subor¬ 
dinate  SLAs  or  by  explicitly  allocating 
portions  of  a  metric  to  their  responsible 
entities. The bases of such allocation are
the formulas for combining metrics
from multiple contributors (e.g., service
providers, infrastructures). An appendix
in the online version of this article provides
such formulas.
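The appendix formulas are not reproduced in this text. As a rough illustration only, the sketch below uses a common simplifying assumption: contributors are traversed in series and fail independently, so service times add while availabilities and reliabilities multiply. The contributor names and values are hypothetical, and the article's own formulas may differ.

```python
import math


def combine_serial(contributors):
    """Combine T, A, and R for contributors in series (independence assumed)."""
    return {
        "T": sum(c["T"] for c in contributors),        # delays add up
        "A": math.prod(c["A"] for c in contributors),  # all must be up
        "R": math.prod(c["R"] for c in contributors),  # all must succeed
    }


# Hypothetical allocation between a network provider and a service provider.
network = {"T": 0.2, "A": 0.999, "R": 0.99}
service = {"T": 0.5, "A": 0.995, "R": 0.98}
end_to_end = combine_serial([network, service])
print(end_to_end)
```

Read in reverse, the same relations suggest how an end-to-end target could be split into portions for each responsible entity.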

Note
user account for access. Online application
forms can be found on the sites.
Some require government sponsorship.

Coming Events

January 7-10
Hawaii International Conference
on System Sciences
Waikoloa, HI
www.cert.org/secure-coding/
HICSS-4I-CSSIS CFP.html

January 10-12
The 35th Annual ACM SIGPLAN-SIGACT
Symposium
San Francisco, CA
www.cs.ucsd.edu/popl/08/

January 10-12
The 5th IEEE Consumer Communications
and Networking Conference
Las Vegas, NV
www.ieee-ccnc.org/2008/

January 14-16
Soldier Technology US
Arlington, VA
www.soldiertechnologyus.com

January 22-25
Network Centric Warfare 2008
Washington D.C.
www.ncwevent.com

January 30-31
15th Annual Multimedia Computing
and Networking
San Jose, CA
http://mirage.cs.uoregon.edu/mmcn
2008/

April 29-May 2
Systems and Software
Technology Conference
Las Vegas, NV
www.sstc-online.org

Coming Events: Please submit coming events that
are of interest to our readers at least 90 days
before registration. E-mail announcements to:
nicole.kentta@hill.af.mil.

Web Sites

Scientific and Technical
Information Network
http://stinet.dtic.mil
The  Public  Scientific  and  Technical 
Information  Network  (STINET)  is 
available  to  the  general  public,  free  of 
charge.  It  provides  access  to  citations  of 
unclassified  unlimited  documents  that 
have  been  entered  into  the  Defense 
Technical  Information  Center’s  Tech¬ 
nical  Reports  Collection,  as  well  as  the 
electronic  full-text  of  many  of  these  doc¬ 
uments.  Public  STINET  also  provides 
access to the Air University Library
Index to Military Periodicals, Staff
College Automated Military Periodical
Index, Department of Defense (DoD)
Index to Specifications and Standards,
and Research and Development Descriptive
Summaries.

The  Data  and  Analysis 
Center  for  Software 

www.thedacs.com

The  Data  and  Analysis  Center  for 
Software  (DACS)  is  a  DoD  Information 
Analysis  Center.  The  DACS  has  been 
designated as the DoD Software Information
clearinghouse that serves as an
authoritative source for state-of-the-art
software information, providing technical
support for the software community.
The  DACS  technical  area  of  focus  is  soft¬ 
ware  technology  and  software  engineer¬ 
ing,  in  its  broadest  sense.  The  DACS  is  a 
central  distribution  hub  for  software 
technology  information  sources.  The 
DACS  offers  a  wide  variety  of  technical 
services  designed  to  support  the  develop¬ 
ment,  testing,  validation,  and  transition¬ 
ing  of  software  engineering  technology. 

Java.net 

www.java.net

Java.net  is  the  realization  of  a  vision  of  a 
diverse  group  of  engineers,  researchers, 
technologists,  and  evangelists  at  Sun 
Microsystems,  Inc.  to  provide  a  common 
area  for  interesting  conversations  and 
innovative  development  projects  related 
to  Java  technology.  The  community  con¬ 
tinues  to  grow  with  industry  associa¬ 
tions,  software  vendors,  universities,  and 
individual  developers  and  hobbyists  join¬ 
ing  every  day.  As  they  meet,  share  ideas, 
and  use  the  site's  collaboration  tools,  the 
communities  they  form  will  uncover 
synergies  and  create  new  solutions  that 
render  Java  technology  even  more  valu¬ 
able. 


A Primer on Java Obfuscation

Stephen Torri, Derek Sanders, and Dr. Drew Hamilton
Auburn University

Gordon Evans
Missile Defense Agency

Java is not a secure language and its increasing use puts sensitive information at risk. While the authors do not recommend
using Java for software that involves sensitive information, the current reality is that Java is used in these applications. To address this
reality, this article discusses Java obfuscation techniques.


In  today’s  software-oriented  world,  soft¬ 
ware  ownership  frequently  changes.  It  is 
difficult,  if  not  sometimes  impossible,  to 
keep  track  of  who  has  a  certain  piece  of 
software at any given time. This presents a
problem for those who wish to keep the
software’s internal operations a secret.
Languages such as Java, which preserve a
lot of high-level information in their byte
code, present a problem from the standpoint
of source ownership and securing it
from program de-compilation. Releasing
Java classes can compromise sensitive information
embedded in the software, such as a
missile intercept computation. Java class
files contain the byte code instructions interpreted
by the Java Virtual Machine (JVM).
These files are easily read by programs that
can recreate a source file from the class file.
Java  obfuscation  techniques  were  developed 
to  make  reverse  engineering  harder,  but 
many  of  these  techniques  can  be  defeated. 

While Java was developed to be used on embedded systems, its popularity has pushed it into the public as a mainstream language. It is the view of the authors that if the developers of a program (e.g., defense industries) do not want its source code reverse engineered, then Java should not be used. Java programs cannot be completely protected from reverse engineering. All protection will do is slow down a determined attacker. In this article, we describe the three major techniques of Java obfuscation used in present state-of-the-art tools.

Commercial obfuscation applications generally perform three functions to secure the Java source code. First, extra loops, jumps, or even additional classes are added to change the control flow of the program so that an attacker has extra difficulty in understanding the program. Second, package, class, method, and field names are changed so they no longer state what they are for (e.g., a field named 'account_balance' is now 'b'). Finally, any text strings contained in the program are encrypted.

Obfuscation  Techniques 

The following sections describe the three major techniques of Java obfuscation used in present state-of-the-art tools.

Control  Flow  Obfuscation 

Control flow obfuscation adds extra code and loops so that the program's logic is difficult to follow, either causing an attacker to give up or confusing a tool into producing undesired results. While this is a good attempt at protection, there are semi-automated tools, such as LOCO [1], that allow a human user to interpret the code and distinguish between useless code and real code. While control flow obfuscation is not foolproof, it increases the difficulty an attacker has in reverse engineering a program.

Name  Obfuscation 

Name obfuscation is used to remove any information an attacker would gain by merely reading the names of fields, methods, and classes. For example, if the original developer used meaningful names to aid development, these names would also help the attacker. By changing the names, the meaning of the code is harder to understand. This is quite similar to the problem of decompiling x86 binaries. When decompiling x86 binaries into an intermediate language, e.g., Register Transfer Language, an attacker has to figure out the contents of the accumulator register and how it is used. This can be extremely tedious but not impossible. Similarly, an attacker with a Java class file, where the names are changed to simple letters (e.g., 'b' or 'cl'), is faced with a similar challenge. The strength of this method is that it removes a very useful means of program comprehension from the hands of an attacker. However, its weakness is that a human using an interactive deobfuscation environment, possibly a modified LOCO equivalent program, can discern what the variable 'b' means, and the program they use could propagate the recovered name 'account_balance' throughout the control flow graph wherever 'b' is used. Name obfuscation raises the level of difficulty in reverse engineering, but does not make it impossible.
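The renaming step that such an interactive tool performs can be sketched in a few lines of Java. This is only a text-based illustration: the class name RenamePropagation, the method propagate, and the sample statement are assumptions made for this sketch, and a real tool would propagate names through the control flow graph, not through source text.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: replaces obfuscated identifiers with names an analyst has recovered.
final class RenamePropagation {

    // Applies every recovered name to whole-word occurrences of the obfuscated name.
    static String propagate(String code, Map<String, String> recovered) {
        String result = code;
        for (Map.Entry<String, String> entry : recovered.entrySet()) {
            // \b anchors keep 'b' from matching inside longer identifiers.
            Pattern wholeWord = Pattern.compile("\\b" + Pattern.quote(entry.getKey()) + "\\b");
            result = wholeWord.matcher(result)
                    .replaceAll(Matcher.quoteReplacement(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> recovered = new LinkedHashMap<>();
        recovered.put("b", "account_balance"); // the analyst's guess for field 'b'
        System.out.println(propagate("b = b + a(c);", recovered));
    }
}
```

Once a single name is recovered, every use of it becomes readable at once, which is why name obfuscation only raises the cost of analysis.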

String  Encryption 

String encryption is utilized as an attempt to secure the code for a limited period of time. The more sensitive the information being protected, the stronger the encryption should be. By eliminating another source of information, obfuscation programs use this technique to increase the level of difficulty in an attempt to prevent deobfuscation. However, string encryption is almost useless when the key for decryption is contained inside the program file, which is the case unless an external key is used. It has been shown that attackers have already discovered how to decrypt these strings [2]. Encryption is useful only if an external key is used. This, however, presents the classic key distribution and management issue. Using an external key requires securely sharing it via some mechanism, which is outside the scope of this article.
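The weakness of an embedded key can be illustrated with a minimal Java sketch. It assumes a simple XOR cipher, which is a stand-in chosen for brevity and not the scheme used by any particular obfuscator; the class name EmbeddedKeyDemo, the key bytes, and the method names are all hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: the decryption key and routine ship inside the same class file.
final class EmbeddedKeyDemo {

    // Anyone who can read the class file can read this key.
    private static final byte[] KEY = {0x5A, 0x13, 0x7C};

    // XOR is its own inverse, so one routine both encrypts and decrypts.
    private static byte[] xor(byte[] data) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ KEY[i % KEY.length]);
        }
        return out;
    }

    static byte[] encrypt(String plain) {
        return xor(plain.getBytes(StandardCharsets.UTF_8));
    }

    // An attacker can simply call (or copy) this method to recover every string.
    static String decrypt(byte[] cipher) {
        return new String(xor(cipher), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] stored = encrypt("Hello World");
        System.out.println(decrypt(stored));
    }
}
```

Because the program must be able to decrypt its own strings at run time, the attacker receives the same capability along with the class file.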
Java Byte Code

Java is compiled from source files into class files containing byte codes that are later interpreted or compiled into machine code at runtime in a JVM. The Java class files present a potential security problem, since simply compiling the Java source code does not do enough to secure it from being recovered. Disassembly of the Java byte code is easy to do with the tools provided by Sun Microsystems as a part of its software development kit (SDK). For example, the following is the classic Hello World written in Java:

public class Hello {
    public static void main(String[] args)
    {
        System.out.println("Hello World");
    }
}

This example simply prints the string Hello World to the standard console window. Compiling this program with the javac compiler produces a Java class file called Hello.class. This file is run with the java program to produce the desired results. The Java class file can be easily disassembled into a human-readable form using the javap [3] disassembler program included in the Sun Microsystems Java Development Kit (JDK).

For example, the following is the output for Hello.class after running javap:

Compiled from "Hello.java"
class Hello extends java.lang.Object {
Hello();
  Code:
   0: aload_0
   1: invokespecial #1; //Method java/lang/Object."<init>":()V
   4: return

public static void main(java.lang.String[]);
  Code:
   0: getstatic #2; //Field java/lang/System.out:Ljava/io/PrintStream;
   3: ldc #3; //String Hello World
   5: invokevirtual #4; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   8: return
}


We have recovered enough information that a developer with a tool such as Dava (McGill University's Java decompiler), included in the Java optimization framework called Soot [4], can quickly obtain the original source code, as seen in the following:

import java.io.*;

class Hello {

    Hello() {
        super();
    }

    public static void main(java.lang.String[] r0) {
        System.out.println("Hello World");
    }
}

This example is a simple one, but it illustrates the point: merely compiling a Java application with the Sun JDK does not offer any protection against decompiling the program. This is why developers use obfuscation in order to get some level of protection against reverse engineering. Obfuscation makes it harder, but not impossible, to reverse engineer the code.
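How much naming information a compiled class carries can also be seen from inside the JVM using the standard reflection API. The following is a minimal sketch; the class NameLeakDemo, the nested class Account, and its members are hypothetical stand-ins invented for this illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: lists the member names that survive compilation.
final class NameLeakDemo {

    // Stand-in for an unobfuscated class with meaningful names.
    static final class Account {
        private int accountBalance;

        int deposit(int amount) {
            accountBalance += amount;
            return accountBalance;
        }
    }

    // Collects declared field and method names, skipping compiler-generated members.
    static List<String> declaredNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (!f.isSynthetic()) {
                names.add("field " + f.getName());
            }
        }
        for (Method m : type.getDeclaredMethods()) {
            if (!m.isSynthetic()) {
                names.add("method " + m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        for (String name : declaredNames(Account.class)) {
            System.out.println(name);
        }
    }
}
```

The names printed are read from the class file itself, which is the same information a disassembler or decompiler works from.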

Java Obfuscation

Obfuscation works by confusing the flow of the source code so that its intent is difficult to recover. However, in order to effectively show how obfuscation works, a more complex example is needed. The following code takes an integer value from the command line as an argument and reports back a list of Fibonacci numbers (see Note 1). For example, running the command java Fibonacci 5 will give back the calculated Fibonacci number for 5, 4, 3, and so on down to 0.

public class Fibonacci {

    public int calculate(int n)
    {
        int output = 0;
        if (n > 1)
        {
            output = calculate(n - 1) + calculate(n - 2);
        }
        else
        {
            output = n;
        }
        return output;
    }

    public static void main(String[] inc)
    {
        if (inc.length > 0)
        {
            Fibonacci f_ref = new Fibonacci();
            int n = Integer.parseInt(inc[0]);
            while (n != -1)
            {
                System.out.println("Calculated fibonacci number: "
                    + f_ref.calculate(n));
                n--;
            }
        }
        return;
    }
}

After  using  javap,  the  byte  code  prior 
to  obfuscation  is  shown  in  the  following: 

public int calculate(int);
  Code:
   0: iconst_0
   1: istore_2
   2: iload_1
   3: iconst_1
   4: if_icmple 26
   7: aload_0
   8: iload_1
   9: iconst_1
   10: isub
   11: invokevirtual #2; //Method calculate:(I)I
   14: aload_0
   15: iload_1
   16: iconst_2
   17: isub
   18: invokevirtual #2; //Method calculate:(I)I
   21: iadd
   22: istore_2
   23: goto 28
   26: iload_1
   27: istore_2
   28: iload_2
   29: ireturn

The commercially available obfuscation program Zelix KlassMaster is used to obscure the names of classes, methods, and variables; encrypt any strings; and complicate the control flow. Though the nature of the program is hidden and obscured, the byte code is still easy to read. The important blocks of obfuscated byte code are explained in the following:

public  int  a(int); 
Code:

0: getstatic #56; //Field A:Z
3: istore_3

The previous lines load the value of a static field onto the stack. In javap's notation the letter after the colon is the field's type descriptor, so A:Z denotes a field named A of type boolean (Z). The value is stored into the third local variable (var_3 = A).

4 :  iconst_0 
5 :  istore_2 

These  instructions  set  the  second  local 
variable  to  zero  (var_2  =  0). 

6: iload_1

This  instruction  loads  the  value  of  func¬ 
tion  parameter  ‘n’  onto  the  stack. 

7 :  iload_3 
8:  ifne  50 

These instructions check whether var_3 (the third local variable) is not equal to zero. If it is not, execution jumps to label #50; otherwise it continues to label #11. Though not seen here, at label #50 the variable 'output' is set to the value of variable 'n' and returned.

11: iconst_1
12: if_icmple 49

At this point, the constant integer value of '1' is loaded onto the stack and compared with the previous stack entry, 'n'. If 'n' is less than or equal to 1, then it will jump to label #49; otherwise it continues to label #15. Label #49 is not shown here, but its instructions set the variable 'output' equal to the value of variable 'n', which is then returned. The two checks performed at lines 7-8 and 11-12 are different from the original check in the unobfuscated code, which tested whether variable 'n' was greater than 1. The obfuscation program has altered the control flow in an attempt to obscure the nature of the function.

15: aload_0
16: iload_1
17: iconst_1
18: isub
19: invokevirtual #2; //Method a:(I)I

The function a:(I)I is the original function calculate(int n), which returns an integer result. This byte code loads the object reference (this), the value of variable n, and a constant integer value of '1' onto the stack. It then calculates n - 1 and places the result on the stack. The call to the function a:(I)I with the result is the last step. This is equivalent to the function call 'calculate(n - 1)'.

22: aload_0
23: iload_1
24: iconst_2
25: isub
26: invokevirtual #2; //Method a:(I)I

These instructions are similar to the description above, except the function call is equivalent to 'calculate(n - 2)'.

29:  iadd 
30:  istore_2 

At this point the results of 'calculate(n - 1)' and 'calculate(n - 2)' are taken from the stack, added together, and the result is placed back on the stack. This is equivalent to 'calculate(n - 1) + calculate(n - 2)'. The result is stored in var_2.

31:  iload_3 
32:  ifeq  51 

35:  getstatic  #58;  //Field  z:Z 
38:  ifeq  45 
41:  iconst_0 
42 :  goto  46 
45 :  iconst_l 

46:  putstatic  #58;  //Field  z:Z 

Shown here is a reference to the static field z. As before, the letter after the colon is the type descriptor, so z:Z is a boolean field named z. The original program did not contain this field; the obfuscator has added it to obscure the meaning. Labels #31-32 compare the value of var_3 to zero. If var_3 is equal to zero, then the value of var_2 (the original variable called 'output') is returned; otherwise, the field z is compared to zero. If z is equal to zero, then z is set to 1; otherwise it is set to zero. These instructions are inserted by the obfuscator as do-nothing statements to enhance security and complicate deobfuscation, forcing additional work to obtain the original code.

49: iload_1
50: istore_2
51: iload_2
52: ireturn

Finally,  the  results  of  the  function  call 
are  returned  to  the  original  caller. 
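Reading the walkthrough back into source form gives a sense of what an analyst recovers. The following is a hand-written approximation of the obfuscated method in Java, not the output of any tool; the class name ObfuscatedFibonacciSketch and the field names flagA and flagZ are assumptions, and the static value loaded at offset 0 is modeled as a boolean that is never set.

```java
// Hypothetical sketch: source-level reading of the obfuscated byte code listing.
final class ObfuscatedFibonacciSketch {

    // Stand-ins for the static values referenced at offsets 0 and 35; never set to true here.
    static boolean flagA = false;
    static boolean flagZ = false;

    // The original method from the article, condensed.
    static int calculate(int n) {
        return (n > 1) ? calculate(n - 1) + calculate(n - 2) : n;
    }

    // Approximation of the obfuscated method a(int).
    static int a(int n) {
        boolean guard = flagA;                  // offsets 0-3
        int output = 0;                         // offsets 4-5
        if (guard) {                            // offset 8: ifne 50
            output = n;
        } else if (n <= 1) {                    // offset 12: if_icmple 49
            output = n;
        } else {
            output = a(n - 1) + a(n - 2);       // offsets 15-30
            if (guard) {                        // offset 32: ifeq 51 skips this block
                flagZ = !flagZ;                 // offsets 35-46: do-nothing toggle
                output = n;                     // fall through to offsets 49-50
            }
        }
        return output;                          // offsets 51-52
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 10; n++) {
            System.out.println(n + ": " + a(n) + " vs " + calculate(n));
        }
    }
}
```

As long as the guard stays false, the inserted branches never execute and both methods return the same values, which is the property an obfuscating transformation has to preserve.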

Even with obfuscation, anyone with access to the Java class files has access to the byte code and hence is capable of reversing the obfuscation process. The LOCO project, which is designed to aid a security analyst in understanding obfuscated code, could be used for this purpose. While the project is designed to look at instructions on an x86 architecture, a similar project designed for Java byte code would be much simpler to implement. This is due to the fact that the number of instructions represented by Java byte code is considerably less than the number of instructions for the x86 architecture. The weaknesses of obfuscation shown with these simple examples illustrate the need for better protection against reverse engineering. In addition, the impact obfuscation has on the performance of the software must also be analyzed and evaluated for acceptability. While many, if not most, Java developers do not read Java byte code, a determined adversary can and will.

Cost  of  Obfuscation 

In order to effectively discuss obfuscation, the impact of obfuscation on performance during normal operations cannot be ignored. Low [5] states that obfuscation should not alter the behavior of the program, as the following definition shows:

Obfuscating Transformation

Let $P \xrightarrow{T} P'$ be a transformation of a source program $P$ into a target program $P'$.

$P \xrightarrow{T} P'$ is an obfuscating transformation if $P$ and $P'$ have the same observable behavior. More precisely, in order for $P \xrightarrow{T} P'$ to be a legal obfuscating transformation the following conditions must hold:

• If $P$ fails to terminate or terminates with an error condition, then $P'$ may or may not terminate.

• Otherwise, $P'$ must terminate and produce the same output as $P$.

The authors believe that changes to the program's control flow and the use of string encryption will inadvertently affect software performance. The degree of the impact depends on the control flow obfuscation method and encryption algorithm used. Name obfuscation does not affect the run-time performance of the system. To better understand the impact of obfuscation, it must be expressed formally in terms of run time.

Control  Flow  Obfuscation 

Intuitively, obfuscating the control flow of a Java program should incur some performance cost as it is interpreted. Definition 1 defines a performance measure for control flow obfuscation delay.

Definition 1: Control Flow Obfuscation Delay

Let $T_{ocf} = T_{cf} + \alpha$ be an equation showing the effect of obfuscation, $T_{ocf}$, on original system performance, $T_{cf}$, by the time delay of control flow obfuscation, $\alpha$. If $T_{ocf} \le T_{cf}$, then the obfuscation has either improved the original performance of the program or, at a minimum, met the original performance. More accurately, the equation is $T_{ocf} = T_{cf} + \alpha$, where $\alpha \le 0$. Alternatively, if $\alpha$ is greater than zero, then the obfuscation has had a negative effect on system performance.

An embedded system may have hard real-time constraints which restrict how much additional delay is allowed. By real-time we refer to systems which will fail if the executing software should miss a deadline. The impact of obfuscation on the execution of the program ($\alpha$ in the equation above) would need to be measured to determine if it is at an acceptable level that does not degrade the system performance or user experience. That is, if $T_{ocf} > T_L$, where $T_L$ is a limit of a real-time deadline or acceptable delay, then control flow obfuscation may produce more harm than good.

String  Encryption 

String encryption, on the other hand, will definitely not have a performance effect of zero. In the programs evaluated in [2], three utilized string encryption. The key used was stored in the program file along with the decryption code. The encrypted strings were either kept in the program's class files or placed in extra files included in the Java jar file.

The time delay caused by decryption depends on the encryption algorithm, key length, and the plain text. The original program had to access the location in memory where the original string was and return it to the place in the program where it was used ($\delta = T_r$, where $T_r$ is the retrieval time). Compare this to the time it takes to retrieve the encrypted string, perform the decryption algorithm, and return the plain text string ($\delta = T_r + T_d + T_{pr}$), where $T_r$ is the time to retrieve the encrypted string, $T_d$ is the decryption time, and $T_{pr}$ is the time to return the string to the requester. Therefore, the time required to process and return the encrypted string should be greater than that of a non-encrypted string.

Definition 2: Encrypted String Obfuscation Delay

Let $T_{es} = T_{ps} + \delta$ show the effect of using encrypted strings, $T_{es}$, on system performance using plain strings, $T_{ps}$, by the time delay for encrypted string decryption, $\delta$. The time delay of decryption should never be zero ($\delta > 0$); therefore, $T_{es} \neq T_{ps}$, since the act of decryption cannot simply be dismissed as something that can be ignored. Some amount of time would be required, so it is more accurate to say $T_{es} = T_{ps} + \delta$ where $\delta > 0$. The same restriction as described in Definition 1 applies. If $T_{es} > T_L$, where $T_L$ is a limit of a real-time deadline or acceptable delay, then the time for decryption of the encrypted strings is considered a hindrance to acceptable program operations.

Combined  Effects  of  Control  Flow 
Obfuscation  and  String  Encryption 

The total performance impact of obfuscation can be determined by combining Definitions 1 and 2.

Definition 3: Performance Effect of Obfuscation

Let $T' = T + \alpha + \delta$ show the effect that both control flow obfuscation ($\alpha$) and string decryption ($\delta$) have on the original system performance. It is important to consider both effects on performance, since the protection of a program should not rely solely on one technique. These effects will have an effect on security as well as an impact on performance.
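The three definitions reduce to simple arithmetic on measured times, which can be sketched in Java as follows. The class name ObfuscationDelay, the method names, and the sample values in main are assumptions made for illustration; the time unit is arbitrary as long as it is used consistently.

```java
// Hypothetical sketch: delay terms from Definitions 1 to 3, computed from measured times.
final class ObfuscationDelay {

    // Definition 1: control flow delay is the obfuscated time minus the original time.
    static double controlFlowDelay(double originalTime, double obfuscatedTime) {
        return obfuscatedTime - originalTime;
    }

    // Definition 2: decryption delay is the encrypted-string time minus the plain-string time.
    static double decryptionDelay(double plainTime, double encryptedTime) {
        return encryptedTime - plainTime;
    }

    // Definition 3: total time is the original time plus both delays.
    static double combinedTime(double originalTime, double controlFlowDelay, double decryptionDelay) {
        return originalTime + controlFlowDelay + decryptionDelay;
    }

    // The obfuscation is acceptable only while the total stays within the deadline.
    static boolean withinLimit(double totalTime, double limit) {
        return totalTime <= limit;
    }

    public static void main(String[] args) {
        double alpha = controlFlowDelay(100.0, 103.0); // invented sample measurements
        double delta = decryptionDelay(100.0, 105.0);
        double total = combinedTime(100.0, alpha, delta);
        System.out.println("total = " + total + ", within limit of 110: " + withinLimit(total, 110.0));
    }
}
```

A negative or zero control flow delay corresponds to the case in Definition 1 where obfuscation met or improved the original performance.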

Test  Results 

Four preliminary tests were conducted to calculate the performance cost of various methods of obfuscation. The tests were conducted on a 3GHz Pentium 4 system running Fedora Core 6, using Java 1.6 to compile the program, the Zelix KlassMaster 5.0 trial version obfuscator, and GNU Compiler Collection 4.1.1 20070105 (Red Hat 4.1.1-51) to compile the driver. A C++ driver program was created to run the target Java class file as the 'root' user on the system 50 times and calculate the average number of central processing unit clock cycles it took to execute the target class file. The results for the tests can be seen in Table 1.

The tests show that even for a simple example, control flow obfuscation and string encryption have some impact on the performance of the system. None of the obfuscation methods improved the performance of the target application. The impact of obfuscation must be analyzed as a part of development in order to measure the impact on system performance and user experience. Further testing and refinement of these metrics will provide a means for program managers to evaluate the performance costs of the many different Java obfuscators on the market (and in the public domain).
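The percentage differences reported in Table 1 follow from the measured averages by simple arithmetic, sketched below in Java. Only the mantissas of the Table 1 values are used, since all four measurements share the same power of ten; the class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical sketch: relative slowdown of each obfuscated run against the baseline.
final class PercentDifference {

    // Percentage by which 'measured' exceeds 'baseline'.
    static double percentDifference(double baseline, double measured) {
        return (measured - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        double baseline = 2.7069;                        // unobfuscated Fibonacci
        double[] measured = {2.71142, 2.71478, 2.71356}; // the three obfuscated runs
        for (double m : measured) {
            System.out.println(String.format(Locale.ROOT, "%+.3f%%", percentDifference(baseline, m)));
        }
    }
}
```

The first two results round to the +0.17% and +0.29% shown in the table; the third comes out near 0.246, which the table reports as +0.24%.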

Conclusion 

Obfuscation is a method (albeit imperfect) to protect the intellectual property rights of a program's creators. Obfuscation could also be thought of as a method of protection against reverse engineering by making it difficult for a hacker to obtain a high-level representation of Java source code in order to make changes. Obfuscation does not provide any sort of run-time protection like watermarking or calculated checksums at periodic locations.

Organizations need to consider strongly what information is being released when a piece of software is distributed. It cannot be assumed that information hard-coded into a program will not be retrieved. This is of considerable importance when evaluating software for release through foreign military sales or other coalition partner arrangements.

For those looking to secure their software, there are professional tools available that make claims of high dependability. Many companies offer tools for both Java obfuscation and .NET obfuscation. Additional claims of these tools are that they reduce package size and increase efficiency. Evaluation of these claims is on our list of future work.

Table 1: Obfuscation Tests

Test                                                 Time (CPU clock cycles)   Percentage difference
Unobfuscated Fibonacci                               2.7069 x 10®              0%
Fibonacci program with aggressive control            2.71142 x 10®             +0.17%
  flow obfuscation
Fibonacci program with flow obfuscation              2.71478 x 10®             +0.29%
  string encryption
Fibonacci program with aggressive control            2.71356 x 10®             +0.24%
  flow obfuscation and flow obfuscation
  string encryption (see Note 2)

It is generally agreed that Java can be reverse engineered. Obfuscation only slows an attacker down, but it also increases the costs of reverse engineering sufficiently to deter many economic motives for reverse engineering. Anyone who dismisses obfuscation has probably not tried to reverse engineer non-trivial programs. Reverse engineering of militarily sensitive software is not constrained by the same economics as commercial software.

Why is Java used in defense software? Reducing development costs is one reason. Often, after the software has been delivered, there are compelling reasons to make the software available under foreign military sales. It is then too late to observe that Java should not be used, and translating millions of lines of Java code into something else is not a feasible option. What do you do? Obfuscation certainly does not solve this problem, but it is an option that government program managers acquiring software-intensive systems should be aware of, as well as the larger issue of programming language selection in terms of software requirements and design.
Notes

1. The Fibonacci numbers are the sequence of numbers $\{F_n\}_{n=1}^{\infty}$ defined by the linear recurrence equation $F_n = F_{n-1} + F_{n-2}$ with $F_1 = F_2 = 1$. As a result of the definition, it is conventional to define $F_0 = 0$. (Wolfram MathWorld <http://mathworld.wolfram.com/FibonacciNumber.html>).


2. The average time of aggressive control flow obfuscation and string encryption is most likely due to the fact that the control flow obfuscation has been optimized in some manner.
Advancing Defect Containment to Quantitative Defect Management


Alison  A.  Frost  and  Michael  J.  Campo 

Raytheon

The defect containment measure is traditionally used to provide insight into project success (or lack thereof) at capturing defects early in the project life cycle, i.e., the time when defect repair costs are at their minimum. Although the measure does provide insight into the effectiveness of early defect capture techniques (such as peer reviews), defect containment in its most common form (percentage of defects captured) is a lagging indicator, as its ultimate value cannot be known until a project is complete. At that point, it is too late for a project to take corrective action. Using raw defect containment data and deriving Quantitative Defect Management (QDM) measures early in the development life cycle provides opportunities for a project to identify issues in defect capture before costs spiral out of control, schedule delays ensue, and another Death March begins [1].


Software quality issues have become a sad cliche in the software engineering industry. Versions 1.0 of commercial software products are notoriously defect-ridden. Furthermore, mission critical software has exhibited spectacular disasters, such as the loss of the Mars Climate Orbiter when English units were used in the coding of the ground software file used in trajectory models rather than the specified metric units [2]. Ensuring software quality in mission critical systems is a primary cost driver in software development.

Several leading industry experts have analyzed defect injection rates during software development. Watts Humphrey found, "... even experienced software engineers normally inject 100 or more defects per KSLOC [thousand lines of code] into their programs" [3]. Capers Jones gathered, "A series of studies found the defect density of software ranges from 49.5-94.6 errors per thousand lines of code" [4].

Compounding this situation, defects detected late in the development cycle cost many times more to repair than defects detected in the stage in which they were injected. For example, Watts Humphrey's research showed how the time it takes to fix a defect grows when the defect escapes out-of-stage, as shown in Table 1 [3].

Defect  Containment  Basics 

Many companies employ a defect containment strategy in an attempt to reduce software costs and increase software quality. Programs and/or organizations may report the resulting measures from this strategy monthly as part of their team feedback or management reviews. Defect containment divides the engineering development cycle into separate stages and maps the stage in which a defect originated to the stage in which the defect was detected (see Table 2).

Defects may originate at any stage of the software development life cycle (although usually the greatest percentage of defects originates in the code and unit test stage). Defects detected in-stage are typically those defects detected during peer reviews or unit tests. Defects detected out-of-stage are those detected after the work product (e.g., design specification or code) has been delivered to a downstream user (e.g., design released to development team or code released to software integration team). In-stage defects appear along the diagonal cells. (In Table 2, 2,421 defects originated and were detected during code and unit test.) Out-of-stage defects appear in the cells below the diagonal. (In Table 2, 1,525 defects originated in code and unit test, but were not detected until software integration.) These defect data provide insights to identify which processes cause the most defects and which processes allow defects to escape.

Defect  containment  is  usually  reported 
as  percentages  of  defects  captured  in  the 
stage  in  which  they  originated  (see  Table 
3). 

Using the data from Table 2, 48 percent of defects originating in design were detected (contained) in the design review process; 55 percent of defects originating in code were detected in the code review/unit test process; and the overall defect containment, which equals the total number of defects caught in-stage divided by the total defects for Table 2, was the following:

(1,515 + 1,555 + 2,421 + 37 + 1 + 10 + 0)/11,292 = 49 percent.
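The overall containment figure can be reproduced with a few lines of Java. This is a minimal sketch: the in-stage counts and the total are the values quoted in the calculation above, while the class and method names are hypothetical.

```java
// Hypothetical sketch: overall defect containment as in-stage captures over total defects.
final class DefectContainment {

    // Returns the percentage of all defects that were caught in their stage of origin.
    static double containmentPercent(int[] inStageCounts, int totalDefects) {
        int caughtInStage = 0;
        for (int count : inStageCounts) {
            caughtInStage += count;
        }
        return 100.0 * caughtInStage / totalDefects;
    }

    public static void main(String[] args) {
        int[] diagonal = {1515, 1555, 2421, 37, 1, 10, 0}; // diagonal cells of the matrix
        System.out.println(Math.round(containmentPercent(diagonal, 11292)) + " percent");
    }
}
```

The same function can be applied to a single stage by passing only that stage's in-stage count and the total number of defects that originated there.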

However,  using  defect  containment  to 
measure  effectiveness  as  a  percentage  of 
in-stage  capture  is  a  lagging  indicator. 
Until  a  project  has  gone  through  the  later 
development  stages,  the  ultimate  number 
of  defects  injected  is  unknown. 
Furthermore,  reporting  superlative  in¬ 
stage  capture  rates  prior  to  qualification 
testing  and  system  integration  can  be  very 
misleading.  The  effectiveness  of  design 
and  code  peer  reviews  is  unknown  until  it 
is  too  late  for  a  project  to  take  action.  As 
such,  traditional  defect  containment 
becomes  a  useful  post-mortem  tool,  but 
does little to help a project when the project
still has an opportunity to take corrective
action.

Unfortunately, these two defect containment
matrices (i.e., raw data count and
percentage) are where the majority of
engineers  and  managers  conclude  their 
defect  data  examination.  However,  by 
implementing  a  few  derived  measures 
from  the  defect  containment  base  mea¬ 
sures,  one  can  employ  proactive  QDM. 

QDM 

QDM  predicts  the  number  of  defects 
expected  to  be  detected  in  each  stage  of 
software  development,  enabling  proactive 
measures  to  be  taken  early  in  develop¬ 
ment.  Why  wait  until  system  integration  to 
discover  that  design  and  code  peer  reviews 
were  ineffective?  QDM  allows  a  project  to 
compare  its  defect  detection  rates  against 
similar  projects.  These  predictive  and  lead¬ 
ing  (as  opposed  to  lagging)  software  mea¬ 
surements  provide  a  mechanism  to  deter 
defect-driven  cost  and  schedule  overruns. 
This  measure  can  be  reported  to  a  pro¬ 
gram  and/or  organization  periodically 
(e.g.  monthly)  along  with  the  defect  con- 


Table 1: Time to Fix Defect That Escapes Stage (in hours)

Stage                Hours to Fix
Requirement          1
Design               3-6
Coding               10
Development Test     15-40
Acceptance Test      30-70
During Operation     40-1,000
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’  SW  =  Software 

Table  2:  Software  Defect  Containment  Matrix 


tainment  measures  for  team  or  manage¬ 
ment  analysis  and  review. 

Benefits  of  QDM  are  the  following: 

•  Using  predictive  defect  measures,  a 
project  knows  in  real  time  if  it  is  meet¬ 
ing  expected  defect  detection  perfor¬ 
mance.  For  example,  if  a  project  is  not 
finding  the  expected  defects  in  design 
and  code  reviews,  managers  should 
investigate  to  determine  if  there  is  a 
reasonable  cause  or  if  corrective 
action  is  needed. 

•  Underperforming  projects  gain  the 
ability  to  make  corrective  actions  early 
rather  than  discovering  problems  at 
the  end  of  the  project. 

•  Overachieving  projects  provide  the 
organization  a  chance  to  share  best 
practices  and  lessons  learned. 

• Quantitatively understanding the capability
of its peer review process offers
an organization a chance to establish
goals for defect capture and prevention,
laying the groundwork for continuous
improvement activities, and
establishing Capability Maturity Model
Integration (CMMI®) high maturity
processes. (Please note: Although
QDM may be a component of a
CMMI high maturity process, by itself
it may not qualify an organization to
be rated CMMI Maturity Level 4 or 5.)

® CMMI is registered in the U.S. Patent and Trademark
Office by Carnegie Mellon University.

There are five key factors to take into
account when applying QDM. To be
effective, an organization must do the
following:

1. Utilize consistent definitions for terms
such as defect, size unit (e.g., source lines
of code [SLOC]), and life-cycle stages.

2. Automate data collection and reporting
to record and track defect data.
Many change request tools exist that
facilitate the recording and retrieval of
in-stage and out-of-stage defect data,
as well as automating the creation of
derived measurement charts for projects.
Exploitation of automation allows
projects to focus on data analysis rather than
collection.

3.  Use  past  data  to  analyze  current  perfor¬ 
mance  and  predict  future  perfor¬ 
mance.  Doing  so  allows  one  to  create 
and  maintain  control  limits  based  on 
performance  capability.  One  can  man¬ 
age  based  on  quantitative  analysis. 

4.  Involve  and  train  all  levels  of  personnel. 
Besides  improving  data  integrity,  prac¬ 
titioners’  perspectives  and  analyses  are 
often  found  to  be  the  most  valuable. 
Ownership of organizational goals
becomes shared by all levels of
personnel.

5.  Use  QDM  to  improve  project  and 
organizational  performance,  not  to  tar¬ 
get  individuals.  This  is  true  of  any  mea¬ 
sure. 

QDM  aligns  with  many  industry  initia¬ 
tives.  For  example,  QDM  supports  CMMI 
Level  4  and  Level  5  as  well  as  Six  Sigma 
philosophies  [5]. 
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In  order  to  use  defect  containment  in 
this  predictive  manner,  organizations  must 
do  the  following:  1)  establish  a  baseline  by 
using  defect  containment  data  from  previ¬ 
ously  completed  projects,  2)  normalize  the 
defect  data  found  in  each  stage  by  size  (e.g. 
SLOC),  and  3)  apply  statistical  techniques 
to  set  limits  of  expected  defect  detection 
performance. 

Three  QDM  Measures 

Three measures derived from the defect
containment matrix that offer immediate
proactive insight into defect data are the
following: 1) Cumulative Defects
Originated in Design Detected by Stage;
2) Cumulative Defects Originated in Code
and Unit Test Detected by Stage; and 3)
Defect Detection Distribution by Stage.

The  following  steps  will  cover  required 
base  measures,  establishment  of  control 
limits/boundaries,  and  graphical  represen¬ 
tation  of  data.  (More  measures  can  be 
derived  from  defect  containment  to  offer 
proactive  insight,  but  this  sample  is  a  good 
start  for  a  wide  range  of  software  engi¬ 
neers.) 

In  order  to  create  these  three  QDM 
measures,  the  following  base  and  derived 
measures  are  required: 

•  Number  of  defects  by  stage  of  origin 
(leveraged  directly  from  the  defect 
containment  matrix  [see  Table  2]). 

•  Number  of  defects  by  stage  of  discov¬ 
ery  (leveraged  directly  from  the  defect 
containment  matrix  [see  Table  2]). 

•  A  size  count,  such  as  SLOC  or  func¬ 
tion  points. 

• Normalize the defect count by the size
count (e.g., X defects per KSLOC).

Next, use comparison techniques on
historical data to establish the range of
expected defect detection, i.e., control limits.
These data must come from independent
observations of the same process
(e.g., separate design reviews). The data
from each life-cycle stage are compared to
data from its own stage. In cases where a
defect in an earlier stage causes a defect in
a later stage, the defect counts as a single
defect in the stage in which it was originally
introduced.
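The normalization step listed among the base measures can be sketched as follows. This is only an illustration: the function name and the sample numbers are hypothetical, and function points or another size unit could replace SLOC.

```python
def defects_per_ksloc(defect_count, sloc):
    """Normalize a raw defect count by size, expressed per thousand SLOC."""
    if sloc <= 0:
        raise ValueError("size must be positive")
    return defect_count / (sloc / 1000.0)


# Hypothetical example: 120 design defects found in a 25,000-SLOC product.
rate = defects_per_ksloc(120, 25_000)
print(f"{rate:.1f} defects per KSLOC")
```

Normalizing this way lets projects of different sizes be compared against the same historical baseline.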

Control limits can be derived by calculating
3σ limits based on existing data. In
this manner, defect data will fall between
these (3σ) limits 99.7 percent of the time.
Using 3-sigma limits avoids the need to
make assumptions about the distribution
of the underlying natural variation. As
noted by Florac and Carleton:

... experience over many years of
control charting has shown 3-sigma
limits to be economical in
the sense that they are adequately
sensitive to unusual variations
while leading to very few (costly)
false alarms - regardless of the
underlying distribution. [6]


Note:  To  calculate  the  control  charts  in 
these  examples,  the  u-chart  formulas  were 
used. 

For  Cumulative  Defects  Originated  in 
Design  Detected  by  Stage  (Figure  1)  and 
for  Cumulative  Defects  Originated  in 
Code  and  Unit  Test  Detected  by  Stage 
(Figure  2),  3-sigma  control  limits  are 
established  using  the  following  u-chart 
formulas: 


$$\mathrm{UCL} = \bar{u} + 3\sqrt{\frac{\bar{u}}{n_i}}$$

$$\mathrm{LCL} = \max\left(0,\ \bar{u} - 3\sqrt{\frac{\bar{u}}{n_i}}\right)$$

where $\bar{u}$ is the mean for each subgroup
and $n_i$ is the sample size. An example follows
in Table 4. Note: In this example,
KSLOCs were the size units of the design
artifacts.

Table 4: Cumulative Defects Originated in Design Detected by Stage Control Limits

Lifecycle Stage Where            Design Defects/Actual KSLOC
Design Defects Detected          Minimum    Maximum
Design                           3.7        9.9
Code and Unit Test               4.4        10.6
SW Integration                   4.7        10.9
SW Quality Test                  4.9        11.1
System Integration and Test      5.4        11.6
SW Maintenance                   5.4        11.6
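A minimal sketch of the 3-sigma u-chart calculation follows. It assumes the standard u-chart center line (total defects divided by total size across the historical subgroups) and treats each subgroup's size in KSLOC as its sample size; the function name and the sample data are hypothetical and are not the data behind Table 4.

```python
import math


def u_chart_limits(defect_counts, sizes_ksloc):
    """Return the center line and a list of (LCL, UCL) pairs, one per subgroup.

    defect_counts[i] is the number of defects found in subgroup i and
    sizes_ksloc[i] is that subgroup's size, used as the sample size n_i.
    """
    if not defect_counts or len(defect_counts) != len(sizes_ksloc):
        raise ValueError("need matching, non-empty defect and size lists")
    if any(n <= 0 for n in sizes_ksloc):
        raise ValueError("sizes must be positive")
    u_bar = sum(defect_counts) / sum(sizes_ksloc)  # mean defects per KSLOC
    limits = []
    for n_i in sizes_ksloc:
        spread = 3.0 * math.sqrt(u_bar / n_i)
        # The lower limit is clamped at zero because a defect rate cannot be negative.
        limits.append((max(0.0, u_bar - spread), u_bar + spread))
    return u_bar, limits


# Hypothetical history: two past design reviews, each covering 10 KSLOC.
u_bar, limits = u_chart_limits([40, 60], [10.0, 10.0])
for lcl, ucl in limits:
    print(f"center={u_bar:.2f}  LCL={lcl:.2f}  UCL={ucl:.2f}")
```

Because the limits depend on each subgroup's size, larger work products produce a narrower band around the center line.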

For Defect Detection Distribution by
Stage (Figure 3), utilize a set of greater/less
than boundaries. The number of
defects detected in the software requirements
stage should be less than the number
found in the design stage. The number
of defects detected should continue to
increase through the code and unit test
stage. After the code and unit test stage,
the defects detected in each stage should
decrease through the remaining stages,
with the software maintenance stage detecting
the fewest defects.
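The greater/less than boundaries described above can be expressed as a simple shape check. The sketch below is illustrative only: the stage names follow the life-cycle stages used in this article, while the function name and the sample values are hypothetical.

```python
STAGES = [
    "Requirements",
    "Design",
    "Code and Unit Test",
    "SW Integration",
    "SW Qualification Test",
    "System Integration",
    "SW Maintenance",
]


def follows_expected_shape(detected, peak_stage="Code and Unit Test"):
    """Check that detected defects rise up to the peak stage and fall afterward."""
    counts = [detected[stage] for stage in STAGES]
    peak = STAGES.index(peak_stage)
    rising = all(counts[i] < counts[i + 1] for i in range(peak))
    falling = all(counts[i] > counts[i + 1] for i in range(peak, len(counts) - 1))
    return rising and falling


# Hypothetical normalized defect counts (defects per KSLOC) by detection stage.
sample = dict(zip(STAGES, [2.0, 5.5, 9.0, 4.0, 2.5, 1.0, 0.5]))
print(follows_expected_shape(sample))
```

A result of False would be a prompt for investigation rather than a verdict, in keeping with the way the measures are used in this article.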

Finally,  plot  the  data. 

For  the  Cumulative  Defects  Origi¬ 
nated  in  Design  Detected  by  Stage,  create 
a  chart  where  the  x-axis  is  the  life-cycle 
stage  and  the  y-axis  is  the  number  of 
detected  design-originated  defects  nor¬ 
malized  by  size  unit.  Plot  the  total  nor¬ 
malized  number  of  design  defects  found 
in-stage,  followed  by  the  total  cumulative 
numbers  of  design  defects  detected  in 
each  subsequent  life-cycle  stage  (code  and 
unit  test,  software  integration,  software 
qualification  test,  system  integration,  and 
software maintenance). Depending on analysis
preference, data points may or may not be
connected as a line on the chart; in the
examples that follow, they are connected
(see Figure 1).

For the Cumulative Defects Originated
in Code and Unit Test Detected by
Stage, create the chart by following the same
process described for Cumulative Defects
Originated in Design Detected by Stage
(see Figure 2).

For  the  Defect  Detection  Distribution 
by  Stage,  plot  a  chart  where  the  x-axis  is 
the  software  life-cycle  stage  and  the  y-axis 
is  the  normalized  number  of  defects 
detected  in  each  stage,  regardless  of  the 
stage  in  which  they  were  introduced  (see 
Figure  3  for  a  display  of  the  measure). 

For Defect Detection Distribution by
Stage, defect detection distribution ideally
will mirror the defect injection distribution
(thereby capturing defects as close as
possible to when they were injected). It is
known that the defect injection rate maps
to the Rayleigh distribution curve as
shown in Figure 4 [7]. (Statistically, the
Rayleigh distribution is a Weibull distribution
with a shape parameter of two.) Therefore, it
can be used to track the pattern of defect
removal during the software life cycle.
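To make the Rayleigh and Weibull relationship concrete, the sketch below evaluates both density functions and shows that they agree when the Weibull shape is two and its scale is the Rayleigh scale times the square root of two. Treating the life-cycle stage index as the horizontal variable, and the scale value used here, are assumptions made only for illustration.

```python
import math


def rayleigh_pdf(t, sigma):
    """Rayleigh density with scale sigma; the curve peaks at t == sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if t < 0:
        return 0.0
    return (t / sigma**2) * math.exp(-(t**2) / (2 * sigma**2))


def weibull_pdf(t, shape, scale):
    """Weibull density with the given shape and scale parameters."""
    if shape <= 0 or scale <= 0:
        raise ValueError("shape and scale must be positive")
    if t < 0:
        return 0.0
    return (shape / scale) * (t / scale) ** (shape - 1) * math.exp(-((t / scale) ** shape))


sigma = 3.0  # hypothetical: places the peak at the third life-cycle stage
for stage_index in range(1, 8):
    r = rayleigh_pdf(stage_index, sigma)
    w = weibull_pdf(stage_index, 2, sigma * math.sqrt(2))
    print(f"stage {stage_index}: rayleigh={r:.4f} weibull={w:.4f}")
```

Scaling the density by a project's expected total defects would give a target curve to compare against the detected-defect profile.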

Analysis  Results 

Analysis  of  the  QDM  measures  indicates 
the  best  course  of  action  for  the  project 
and  organization.  Further,  it  is  important 
to  compare  the  QDM  measures  with 
other  measures  that  the  program  or  orga- 
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nization  maintain  in  order  to  obtain  a 
more  complete  understanding.  Ultimately, 
the  QDM  measures  provide  indicators  for 
further  investigation.  Opportunities  to 
improve  performance  will  vary  among 
projects  and  organizations,  as  shown  in 
the  following  examples: 

• If a current project falls above the
upper control limit, a course of action
may be to perform causal analysis to
understand the reason for the behavior.
Possible actions include investigating
means to reduce defects injected,
adjusting control limits, and identifying
best practices for defect detection to
be considered for organizational
deployment.

• If a current project falls below the
lower control limit, a goal may be to
get the current project to be as effective
(e.g., during peer review) as the
past projects; this would be demonstrated
by moving the project within
the control limits over time. In this
case, the project may aggressively work
to improve design and code peer
reviews.

• Different opportunities exist if the
project data falls within the control
limits. Options include deploying
defect prevention measures that drive
the data toward the lower control limit
of the charts illustrated in Figures 1
and 2. Alternatively, one may choose
to gather a large enough sample to
tighten the existing control limits and
decrease projected variability.

• When looking at the Defect Detection
Distribution by Stage measure, if the
project has more defects detected in
the design stage than the code stage
(the defect detection efforts during
code and unit testing may not have
been effective), the project may not be
ready to begin the software integration
effort.
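The first three cases above amount to comparing a project's normalized defect rate against the control limits for the same stage. A minimal sketch, using the Design row of Table 4 as the limits and hypothetical project values, might look like this (the function name is an assumption):

```python
def classify_against_limits(value, lower, upper):
    """Place a normalized defect rate relative to its control limits."""
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    if value > upper:
        return "above upper control limit"
    if value < lower:
        return "below lower control limit"
    return "within control limits"


# Limits for design defects detected in the design stage (Table 4).
DESIGN_LOWER, DESIGN_UPPER = 3.7, 9.9

# Hypothetical project rates in design defects per KSLOC.
for rate in (2.1, 6.0, 12.4):
    print(rate, "->", classify_against_limits(rate, DESIGN_LOWER, DESIGN_UPPER))
```

The returned label only indicates which of the cases applies; choosing the follow-up action remains a matter for causal analysis.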

Establishing control limits on defect
detection provides an organization the
ability to predict the number of defects
that will be inserted into project work
products, based on work product size and
the use of a standard organizational software
development process. Predicting
defects inserted within a statistically
derived range may be used to determine
readiness to move from one development
stage to the next, and to predict future
rework costs.

Further, utilizing organization data or
industry standards on hours to correct defects
by stage, return on investment can be calculated.
Identifying peer review process or
training issues can provide substantial savings
for minimal investment.
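A rough sketch of such a return-on-investment estimate follows. It reads the hour ranges in Table 1 as the effort to fix a defect found at each stage, which is an interpretation; the defect count, the investment figure, and the function names are hypothetical.

```python
# Hours to fix a defect by the stage in which it is found, as (low, high), from Table 1.
HOURS_TO_FIX = {
    "Requirement": (1, 1),
    "Design": (3, 6),
    "Coding": (10, 10),
    "Development Test": (15, 40),
    "Acceptance Test": (30, 70),
    "During Operation": (40, 1000),
}


def rework_hours_saved(defects, caught_stage, escaped_stage):
    """Return (low, high) hours saved by fixing defects earlier rather than later."""
    caught_low, caught_high = HOURS_TO_FIX[caught_stage]
    escaped_low, escaped_high = HOURS_TO_FIX[escaped_stage]
    # Most pessimistic saving: cheapest late fix minus costliest early fix.
    low = defects * (escaped_low - caught_high)
    # Most optimistic saving: costliest late fix minus cheapest early fix.
    high = defects * (escaped_high - caught_low)
    return low, high


def return_on_investment(hours_saved, hours_invested):
    """Return the net saving per hour invested."""
    if hours_invested <= 0:
        raise ValueError("investment must be positive")
    return (hours_saved - hours_invested) / hours_invested


# Hypothetical: better design reviews catch 50 defects that would otherwise
# have surfaced in development test, at a cost of 200 hours of review effort.
low, high = rework_hours_saved(50, "Design", "Development Test")
print(f"Hours saved: {low} to {high}")
print(f"ROI: {return_on_investment(low, 200):.2f} to {return_on_investment(high, 200):.2f}")
```

Reporting a range rather than a single figure keeps the spread of the Table 1 estimates visible.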


Figure 1: Cumulative Defects Originated in Design Detected by Stage
[Chart "Project Design Defects": defects originated in design detected by stage, normalized by developed KSLOC. X-axis: Design, Code and UT, SW Int., SW Quality, Sys. Int., SW Maint. Series: Upper Limit, No. of Defects, Lower Limit.]


Figure 2: Cumulative Defects Originated in Code and Unit Test Detected by Stage Chart
[Chart "Project Code Defects": defects originated in code and unit test detected by stage, normalized by developed KSLOC. Series: Upper Limit, No. of Defects, Lower Limit.]


Figure 3: Defect Detection Distribution by Stage
[Chart "Project Defect Distribution": defect detection distributed by stage (goal is Rayleigh distribution). X-axis, Software Life-cycle Stage: Requirement, Design, Code and Unit Test, SW Integration, SW Quality Test, System Integration, SW Maintenance.]


Some  examples  of  actual  process 
improvements  that  resulted  from  the  use 
of  the  QDM  (implemented  at  Raytheon 
organizations)  include  the  following: 

•  Design  and  code  peer  review  stan¬ 
dards  were  improved,  with  recommen¬ 
dations  of: 

o Design peer review preparation
rate of less than 250 SLOC per
hour per reviewer.

o  Code  peer  review  preparation  rate 






of  less  than  200  SLOC  per  hour 
per  reviewer. 

o Peer review meetings should not
last longer than two hours [8].

•  Peer  reviews  are  postponed  when  par¬ 
ticipation  is  inadequate. 

•  Project  meetings  are  held  to  provide 
feedback  on  QDM  measures,  address 
training,  and  investigate  questions  of 
data  integrity. 

• Software measurement tools were
updated to improve automation of
data collection and support analysis.

The improved peer review process,
data entry and analysis, and measurement
automation were direct results of the
QDM efforts.

Conclusion 

QDM  takes  defect  containment  to  a  new 
level  -  from  a  reactive,  lagging  indicator  to 
a  proactive,  predictive  indicator  of  soft¬ 
ware  quality.  Samplings  of  derived  defect 
measures  with  steps  on  how  to  create 
them  were  offered.  QDM  analyses  pro¬ 
vide  an  array  of  opportunities  for  process 
improvements  that  increase  quality  and 
reduce  costs  at  both  project  and  organiza¬ 
tional  levels.  ♦ 
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BackTalk 


I  Want  My  BACKTALK  Back 


I  just  received  notice  from  CrossTalk’s 
managing  editor  requiring  a  BACKTalk 
article  for  the  December  issue.  Now! 

I apologize in advance for the lack of
preparation. A previous message indicated
that BackTalk was covered, and Dr. Cook
and I could take a couple of issues off. In
reviewing the message, the two issues were
January and February, and I’m on the line for
December.

No  problem.  I’ve  had  short  deadlines 
before.  I  work  well  under  pressure.  Let  the 
creative  juices  flow  from  the  cortex  to  the 
fingers  through  the  keyboard  to  the  screen 
past  the  editor  to  the  page. 

Wait,  what  does  the  last  sentence  of  the 
message  say? 

“It  needs  to  be  short,  as  we  are  printing 
the  article  index  -  so  about  half  of  what  you 
typically  write.” 

Are you kidding me? They are cutting
BackTalk short for an article index? Do
they really think CrossTalk readers wait
with bated breath to meticulously browse
the article index? Do they realize this is the
21st century and indexes belong on the Web?

It  needs  to  be  short?  Meetings  need  to  be 
short.  Commercials  need  to  be  short. 
Queues  need  to  be  short.  Computer  start-up 
times  need  to  be  short,  not  BACKTalk.  It’s 
already  concise  and  to  the  point. 

Would  they  treat  Gustave  Eiffel  this  way? 
“Hey  Gustave,  beautiful  tower,  can  you  make 
that  half  as  high?  We  don’t  want  to  distract 
the  tourists  ...  merci.” 


They  obviously  got  to  da  Vinci.  “Hey 
Leonardo,  here’s  an  offer  you  can’t  refuse.  I 
want  Mona’s  beaming  smile  diluted  to  a  half 
a  grin  ...  grazie.” 

Half of what I typically write? Would you
ask Dennis Miller for half a rant; Krispy
Kreme for half a doughnut; Britney for half
a rehab - okay, I’ll give you that; Lance for
half an effort; the Wizard for half a brain,
heart, or courage? Would you ask Al Gore
for half a carbon footprint? Wait, we did and
he won’t. Hey, there’s an idea, if I buy an article
index offset can I take more space for
BackTalk?

I  understand  half  the  calories,  half  the 
wait,  or  a  half-off  sale,  but  half  a  BACKTalk 
makes  no  sense.  What  are  they  thinking?  Can 
you  imagine  the  CROSSTALK  boardroom 
conversation?  “We  need  space  for  the  annu¬ 
al  article  index.  Where  can  we  find  space?” 

“What about BACKTalk?”

Sting faintly singing: I want my BACKTALK
back.

Synthesizer,  drums,  guitar.  Dire  Straits 
sings: 

Look at them yo-yos, that’s the way to do it;

You  write  BackTalk  for  all  to  see. 
That  ain’t  workin’,  that’s  the  way  you  do  it. 
Article  for  nothing  and  your  kicks  for  free. 

Now  that  ain’t  workin’. 

That’s  the  way  you  do  it. 

Let  me  tell  ya  —  them  guys  ain’t  dumb 
Maybe  get  a  blister  on  their  little  finger 


Maybe  get  a  blister  on  their  bum 

We  gotta  install  article  indexes. 

Custom  issue  deliveries 

We  gotta  move  that  impersonator. 

We  gotta  move  that  Ph.D 

Sting boldly singing: I want my, I want my, I
want my BACKTalk back.

Let’s be honest: We all know that nine out
of 10 CrossTalk readers turn to BACKTALK
straight away for wit, indulgence, and
inspiration.  While  the  Publisher’s  Note  intro¬ 
duces  the  issue,  BackTalk  sets  the  tone  - 
warming  up  a  reader’s  mind  in  preparation 
for  the  technical  feast  inside.  If  they  think 
they  can  get  away  with  a  half-baked 
BackTalk  ... 

Knock,  knock,  knock. 

Pardon  me  a  moment  to  get  the  door. 

Yes?  Excuse  me,  are  you  arresting  me? 
What  have  I  done?  What  have  I  done?  What 
have  I  done!  You’re  arresting  me?  Whoa, 
whoa,  whoa,  get  off  me.  Get  off  me!  Help! 
Why  are  they  arresting  me?  What  did  I  do? 
Get  off  me!  Get  off  me!  I  didn’t  do  anything! 

We  now  return  to  your  regularly  sched¬ 
uled  (half)  BackTalk  (soft,  soothing 
elevator  Muzak). 

—  Gary  A.  Petersen 

Arrowpoint  Solutions,  Inc. 
gpetersen@arrowpoint.us 


Monthly Columns

ISSUE | COLUMN TITLE | AUTHOR

Issue 1: January, Publisher’s Choice
  Sponsor: Unit Compliance Inspection: What Did We Learn? | Diane E. Suchan
  Publisher: Choose Your Favorite | Elizabeth Starrett
  BackTalk: One If By LAN, Two If By C | Dr. David A. Cook

Issue 2: February, CMMI
  Sponsor: Axiomatic Improvement | Randy B. Hill
  BackTalk: Hippocrates and the Oath | Gary A. Petersen

Issue 3: March, Software Security
  Publisher: Collaborating for Secure Software | Elizabeth Starrett
  BackTalk: Project Management Using Random Events | Dr. David A. Cook

Issue 4: April, Agile Development
  Sponsor: “Lead, Follow, or Get Out of the Way” | Kevin Stamey
  BackTalk: (Un) Due Diligence | Dr. David A. Cook

Issue 5: May, Software Acquisition
  Sponsor: Being a Smart Buyer | Tony Guido
  BackTalk: Who Are Those Guys? | Gary A. Petersen

Issue 6: June, COTS Integration
  Sponsor: Navigating the COTS Sea | Diane E. Suchan
  BackTalk: COTS: Commercial Off-The-Shelf or Custom Off-The-Shelf? | Wiley F. Livingston, Jr., PE

Issue 7: July, Enabling Technologies for Net-Centricity
  Sponsor: Delivering the Power of Information | General James E. Cartwright
  BackTalk: Net-Centric Virtuosity | Gary A. Petersen

Issue 8: August, Stories of Change
  Sponsor: The Right Way to Change | Norman R. LeClair
  BackTalk: Common Threads in Life | Glenn Booker

Issue 9: September, Service-Oriented Architecture
  Publisher: SOA Provides Opportunities and Challenges | Elizabeth Starrett
  BackTalk: Evolution in Action - Building Up to a Service-Oriented Architecture | Dr. David A. Cook

Issue 10: October, Systems Engineering
  Sponsor: Revitalization of Systems Engineering Within the Department of Defense and the Expanding Role of Software | Dr. John W. Fischer
  BackTalk: Softwareitaville | Gary A. Petersen

Issue 11: November, Working as a Team
  Sponsor: Working as a Team | Kevin Stamey
  BackTalk: SSMART Team Management | Dr. David A. Cook

Issue 12: December, Software Sustainment
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